Data Science Hour of Code

Activity 4: Selecting a Factor

Scenario:

  • You're a data scientist working with a group of researchers from Antarctica.
  • They have collected a bunch of data about penguins, but need help answering questions about it.
  • Your job is to make some charts from the data and answer some of their questions.

Introduction:

You may have noticed that in our Penguins2.csv dataset, some of the columns have values that are numbers (bill_length, bill_depth, flipper_length) and some have values that are text (species, island). Let's say we want to look at the relationship between bill_depth and bill_length, like we did in Activity 1, but we also want to break that data down into smaller groups based on a Factor like species. This would show us three comparisons of bill_depth versus bill_length, one each for the Adelie, Gentoo and Chinstrap penguins.

Sample of the Penguins2.csv file:

  species    island     bill_depth  bill_length  flipper_length
  Adelie     Torgersen  18.7        39.1         181
  Adelie     Biscoe     18.3        37.8         174
  Adelie     Dream      18.5        36.8         193
  Gentoo     Biscoe     13.2        46.1         211
  Chinstrap  Dream      17.9        46.5         192
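
To make the idea of a Factor concrete, here is a minimal sketch written in Python rather than in the blocks used in this activity. It only illustrates the concept of splitting rows into groups by a text column; it is not how the activity's blocks are implemented. The helper name group_by_factor is hypothetical, and the data is just the five sample rows shown above.

```python
import csv
import io
from collections import defaultdict

# The five sample rows from the Penguins2.csv excerpt above.
SAMPLE_CSV = """species,island,bill_depth,bill_length,flipper_length
Adelie,Torgersen,18.7,39.1,181
Adelie,Biscoe,18.3,37.8,174
Adelie,Dream,18.5,36.8,193
Gentoo,Biscoe,13.2,46.1,211
Chinstrap,Dream,17.9,46.5,192
"""


def group_by_factor(csv_text, factor, x_column, y_column):
    """Hypothetical helper: collect (x, y) pairs into one list per factor value."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Numeric columns are read as text, so convert them to numbers.
        point = (float(row[x_column]), float(row[y_column]))
        # The factor column (text) decides which group the point belongs to.
        groups[row[factor]].append(point)
    return dict(groups)


groups = group_by_factor(SAMPLE_CSV, "species", "bill_depth", "bill_length")
for species, points in groups.items():
    print(species, points)
```

Each group holds only the (bill_depth, bill_length) pairs for one species, which is the same kind of breakdown the chart in this activity shows once species is selected as a Factor.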

Instructions:

In the code editor below, we have a program that makes a Chart comparing Bill Depth and Bill Length. Take code blocks from the palette and place them below the block where we Load the .csv file, but above the block where we make the Chart object.

  1. Use the block(s) in the palette on the left.
  2. Place the 'frame:AddSelectedFactors("species")' block below the 'frame:Load("data/Penguins2.csv")' block in the block editor.
  3. Run the program.
  4. Use the chart in the canvas to answer the questions in the Activity section.

Coding:

Activity:

Use the chart(s) you've created in the Coding section to answer a few questions.

Next Tutorial

In the next tutorial, Customizing the Colors, we will describe how to customize the colors in the Chart.