Data - Lesson 5: Big, Open, Crowdsourced Data

Overview

Students will complete a jigsaw of three different topics at the intersection of data, computing, and global impacts: big data, crowdsourcing, and open data. Students will watch videos or listen to audio recordings about their topic. Each group will complete an activity guide about its topic, and then individuals from each group will share their findings. The lesson concludes with a review of key points.

Goals

Students will be able to:

  • Define and explain the impacts of crowdsourcing, crowdfunding, and citizen science
  • Explain why, in some contexts, large amounts of data need to be analyzed using parallel and scalable systems
  • Explain the impact of open data on scientific research and discovery

Purpose

This lesson zooms back out from the data analysis process to the ways that process is applied in a wide variety of contexts. Students learn how big data, open data, and crowdsourcing each apply and cleverly modify this process. For a summary of the key points of this lesson, review the key takeaways in the slides. In short, however:

  • Big data: "Collect huge amounts of data so we can learn even more from it"
  • Open data: "Sharing data with others so they can analyze it"
  • Crowdsourcing: "Collecting data from others so you can analyze it"

This lesson also builds towards the following lesson on machine learning, which explores a different application of the data analysis process.

Resources

Activity (30 mins)

Group: Place students in pairs

Teaching Tip

Complete the Activity Digitally: Students will have a much easier time accessing articles and videos if they complete the activity digitally. Digital activities can be more accessible.

Supporting the Jigsaw: In this lesson students do a jigsaw of a number of different topics. Students will need access to computers and should spend roughly 10 minutes in their groups listening to audio / video content. During this period, circulate around the room, encouraging students to focus on the questions they've been asked to respond to. This will also help you anticipate which students might participate, or even specifically ask certain students to do so, during the discussion.

Distribute: Give each pair a copy of the Big, Open, and Crowdsourced Data - Activity Guide

Prompt: With a partner:

  • Choose one of the topics
  • Watch the related videos / listen to the podcasts
  • Take notes and be ready to share responses to the questions on your activity guide

Discuss: Have members from each topic share the conclusions from their watching and research. Make sure that students from each group have time to share.

Vocabulary List

  • Scalability
  • Parallel Systems
  • Citizen Science
  • Crowdsource
  • Open Data
  • Open Access

Tasks to Complete

Take some time to review the terms from the vocabulary list, and be prepared to take notes on the following for your topic:

  • What the topic is
  • The key vocabulary you were responsible for researching in the vocabulary list above
  • How this concept uses or modifies the data analysis process
  • Examples of the problems this technique is being used to solve

Wrap Up (5 mins)

Review the key takeaways

Open Data

  • Sharing data with others so they can analyze it
  • Open data is publicly available data shared by governments, organizations, and others
  • Making data open helps spread useful knowledge and creates opportunities for others to use it to solve problems

Citizen Science and Crowdsourcing

  • Collecting data from others so you can analyze it
  • Crowdsourcing is the practice of obtaining input or information from a large number of people via the Internet.
  • Citizen science is research where some of the data collection is done by members of the public using their own computing devices, which helps solve scientific problems
  • Crowdsourcing offers new models for collaboration, such as connecting businesses or social causes with funding
  • Both are examples of how human capabilities can be enhanced by collaboration via computing
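
For classes that want to see the idea in code, the following is a minimal sketch of one way crowdsourced input might be combined: several volunteers label the same item, and the most common label is kept. The function name majority_label, the photo names, and the labels are all hypothetical, made-up sample data; real citizen science projects may combine contributions quite differently.

```python
from collections import Counter

def majority_label(votes):
    """Return the most common label and the share of volunteers who chose it.

    If two labels tie, the one that appeared first in the list is returned.
    """
    counts = Counter(votes)
    label, count = counts.most_common(1)[0]
    return label, count / len(votes)

# Hypothetical volunteer classifications (made-up sample data).
classifications = {
    "photo_001": ["galaxy", "galaxy", "star", "galaxy"],
    "photo_002": ["star", "star", "star"],
}

# Combine the many individual contributions into one answer per photo.
results = {photo: majority_label(votes) for photo, votes in classifications.items()}

for photo, (label, share) in results.items():
    print(f"{photo}: {label} ({share:.0%} agreement)")
```

The agreement share gives a rough sense of how much the volunteers agreed, which can prompt discussion about the reliability of crowdsourced data.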

Big Data

  • Collect huge amounts of data so we can learn even more from it
  • The size of the dataset we analyze affects how much information can be extracted
  • As a result, in business, science, and many other contexts, people are working with increasingly large datasets
  • When data gets too big, it can no longer be processed on one computer. Cloud computing or parallel systems are sometimes used to help process all that information.
  • In general, the scalability of your system is important to consider when working with big data. You want your system to keep working even as you use more and more data.
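
The idea of processing data in parallel can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dataset is a small stand-in list, the helper names (split_into_chunks, count_large_values, parallel_count) are hypothetical, and threads on one computer stand in for the separate machines a real parallel system would use. Python threads do not actually speed up this kind of computation; the point is the pattern of splitting the data, working on each piece independently, and combining the partial results.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(data, num_chunks):
    """Split a list into consecutive chunks of roughly equal size."""
    chunk_size = max(1, -(-len(data) // num_chunks))  # ceiling division
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

def count_large_values(chunk, threshold=50):
    """Work done on one chunk: count the values above the threshold."""
    return sum(1 for value in chunk if value > threshold)

def parallel_count(data, num_workers=4):
    """Process each chunk with its own worker, then combine the partial counts."""
    chunks = split_into_chunks(data, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = list(pool.map(count_large_values, chunks))
    return sum(partial_counts)

readings = list(range(100))  # stand-in dataset: the numbers 0 through 99
total = parallel_count(readings)
print(total)  # 49 values (51 through 99) are greater than 50
```

Changing num_workers changes how the work is divided but not the answer, which is one way to introduce scalability: the same approach keeps working as the dataset grows and more workers are added.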

Assessment: Check for Understanding

For Teachers

Assess: You can collect and evaluate students' activity guides.

Standards Alignment

  • CSTA K-12 Computer Science Standards (2017): 3A-DA-10
  • CSP2021: DAT-2.C.6, DAT-2.C.7, DAT-2.C.8
  • CSP2021: IOC-1.E.1, IOC-1.E.2, IOC-1.E.3, IOC-1.E.4, IOC-1.E.5, IOC-1.E.6

Next Tutorial

In the next tutorial, we will discuss Code.org Unit 9, which explores innovations in everyday life.