Overview

Analyzing and interpreting data typically requires assumptions about the accuracy of the data and the causes of the relationships observed within it. Decisions based on a collection of data often rest just as much on those assumptions as on the data itself. Learning to validate and clearly call out the assumptions made when interpreting data is an important part of both analyzing and communicating about data.

Goals

Students will be able to:

Purpose

In this lesson, students look more closely at why they should separate the what from the why when looking at data. The main purpose is to raise awareness of the assumptions that we (all people) make when looking at data, and to practice calling them out. Some of these assumptions lie hidden beneath the surface, and examples from the news help shed light on them. This mode of reflection will serve students well when they do reflective writing on the performance tasks. Identifying and validating (or disproving) assumptions is an important part of data analysis, because decisions based on data often rest as much on those assumptions as on the data itself. Clear communication about how data was interpreted should also include an account of the assumptions made along the way.

Resources

Links for Getting Started

Activity Guide

Links for Activity

Links for Extended Learning

Getting Started

Google Flu Trends no longer updates its flu trend estimates, so show the Google Flu Trends video below or describe its general purpose, which was to predict outbreaks of the flu. Then discuss the following with the students:

Ask the students to share their responses in small groups or as a class.

Ask the students to read the following articles and discuss: (1) Why did Google Flu Trends eventually fail? (2) What assumptions did its creators make about their data or their model that ultimately proved not to be true?

Some of the key points from the articles that the students should understand include:

Activity

The "Digital Divide"

Using the activity guides, discuss the Digital Divide and the PowerPoint by Lee Rainie with the students (the PowerPoint is also linked in the activity guide). Some of the key points for discussion include:

Identifying Assumptions in Data Analysis

The goal of this activity is to give students practice identifying assumptions that can lead to biased conclusions from data. Follow the instructions in the Activity Guide to guide them through the process.
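If it helps to make the idea concrete, a short program can show how a hidden assumption skews a conclusion. The sketch below is only an illustration and is not part of the Activity Guide: the population, the library-use question, and every number in it are invented. It assumes a survey that is conducted online only, so it silently treats people with internet access as representative of everyone.

```python
# Hypothetical population of 100 people. Each entry is a pair:
# (has_internet_access, uses_public_library_weekly).
# All numbers are invented purely for illustration.
population = (
    [(True, True)] * 10      # online, weekly library users
    + [(True, False)] * 60   # online, not weekly library users
    + [(False, True)] * 15   # offline, weekly library users
    + [(False, False)] * 15  # offline, not weekly library users
)


def share_using_library(people):
    """Return the fraction of people who use the library weekly."""
    return sum(1 for _, uses_library in people if uses_library) / len(people)


# An online-only survey can reach only the people with internet access.
online_respondents = [person for person in population if person[0]]

true_share = share_using_library(population)
online_share = share_using_library(online_respondents)

print(f"Share in the whole population: {true_share:.0%}")
print(f"Share among online respondents: {online_share:.0%}")
```

In this made-up population, 25 of 100 people use the library weekly, but only 10 of the 70 people reachable online do, so the online survey understates library use. Students can be asked to name the assumption that produced the gap and to suggest how they would check it.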

Wrap Up

Ask the students to share what they recorded in their activity guides. What trend did they find interesting, and why? How did they interpret the data? Are there other ways the data could be interpreted? What would they investigate further?

Assessment

Which of the following is the most accurate description of what is known as the "digital divide"?

The digital divide is about how...

Personal Reflection

Consider the following statement from the CS Principles course framework:

7.4.1 C The global distribution of computing resources raises issues of equity, access, and power.

Briefly describe one of these issues that you learned about in the lesson and how it affects your life or the lives of people you know. Keep your response to about 100 words (about 3-5 sentences).

Extended Learning

Read the Medium article by Danny Page, then discuss with the students how journalists have misinterpreted Google Trends data.

Standards Alignment