Data Science Hour of Code
Activity 3: Selecting the Data
Scenario:
- You're a data scientist working with a group of researchers from Antarctica.
- They have collected a bunch of data about penguins, but need help answering questions about it.
- Your job is to make some charts from the data and answer some of their questions.
Introduction:
Now that we're getting the hang of it, we can start to explore more data. Data scientists often have large datasets with many different columns, and they might investigate how the variables relate to one another. In the table below, we can see a sample of Penguins2.csv. This dataset is a lot like our Penguins1.csv, except it has a few more columns. For each entry, we can look at the species of the penguin (Adelie, Gentoo, Chinstrap), which island it lives on (Torgersen, Biscoe, Dream), and some measurements like bill length, bill depth, and flipper length.
species | island | bill_depth | bill_length | flipper_length |
---|---|---|---|---|
Adelie | Torgersen | 18.7 | 39.1 | 181 |
Adelie | Biscoe | 18.3 | 37.8 | 174 |
Adelie | Dream | 18.5 | 36.8 | 193 |
Gentoo | Biscoe | 13.2 | 46.1 | 211 |
Chinstrap | Dream | 17.9 | 46.5 | 192 |
Instructions:
In the code editor below, we have a program that makes a chart. Take code blocks from the palette and place them after the line where we load the .csv file, but before the line where we make the chart object.
- Use the block(s) in the palette on the left.
- Place the 'frame:AddSelectedColumns("bill_depth,bill_length,flipper_length")' block below the 'frame:Load("data/Penguins2.csv")' block in the block editor.
- Run the program.
- Use the chart in the canvas to answer the questions in the Activity section.
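To see what selecting columns does to the data, here is a rough sketch in plain Python rather than in blocks. It is only an analogy: it uses the five sample rows from the table above instead of the full Penguins2.csv, and the names `SAMPLE_CSV` and `select_columns` are made up for this illustration and are not part of the block palette.

```python
import csv
import io

# The five sample rows from the table above, written as CSV text.
# (An assumption for illustration: the real Penguins2.csv has more rows.)
SAMPLE_CSV = """species,island,bill_depth,bill_length,flipper_length
Adelie,Torgersen,18.7,39.1,181
Adelie,Biscoe,18.3,37.8,174
Adelie,Dream,18.5,36.8,193
Gentoo,Biscoe,13.2,46.1,211
Chinstrap,Dream,17.9,46.5,192
"""


def select_columns(rows, names):
    """Keep only the named columns, converting each value to a number.

    `names` is a comma-separated string, mirroring the text passed to
    the AddSelectedColumns block.
    """
    wanted = [name.strip() for name in names.split(",")]
    return {name: [float(row[name]) for row in rows] for name in wanted}


# Read every row of the sample as a dictionary keyed by column name.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Keep the three measurement columns; species and island are left out.
selected = select_columns(rows, "bill_depth,bill_length,flipper_length")

for name, values in selected.items():
    print(name, values)
```

Only the three measurement columns remain after the selection, which is why the chart shows those columns and leaves out species and island.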
Coding:
Activity:
Use the chart(s) you've created in the Coding section to answer a few questions.
Next Tutorial
In the next tutorial, we will discuss Selecting a Factor, which describes how to split our data into groups based on factors.