In this assignment you will practice the following computer science concepts:
In this lab, we will learn how to analyze a large data set to find out what the data means.
Working with large data sets can be challenging if we just look at the raw data, which typically consists of a long list of numbers. To find out what the data means, you need a computer program that uses mathematical calculations to reveal the connections and trends in the data set.
In this exercise, we will use data on atmospheric carbon dioxide levels from NASA. First, you will conduct some Internet research. Second, you will learn how to organize data so that it makes more sense. Third, you will learn how to make a growth rate calculator using a program you write. Through building this computer program, we will also learn how mathematical formulas help in analyzing large data sets.
In this section, we will research the data set for "atmospheric carbon dioxide level," using the Internet. Type the phrase in a search engine (e.g., Google, Bing) and see if you can find raw data. Raw data will consist of a list of numbers, instead of a graph.
The following are sites you can use, if your search is unsuccessful:
Atmospheric Carbon Dioxide Level
When you look at the raw data page on "atmospheric carbon dioxide level" from NASA (the file starts with "USE OF NOAA ESRL DATA"), you may find the data set complicated or even intimidating. The page begins with an explanation of how to read the data, followed by rows and columns of numbers.
We will focus on the numbers in the middle column titled "average." The numbers in this column tell you how much CO2 is in the air, expressed in parts per million (ppm), on average for each month from March 1959 to June 2015.
Looking at the raw data in the "average" column, you can see that the numbers increase from past to present: the data has an increasing trend. We would like to find out more about this trend. Is the rate of increase consistent from decade to decade? Or does it change in some decades?
The traditional pen and paper method for finding this trend data is:
You can write a computer program, or customized calculator, to make this process much easier. Once you write the program, you will just need to input the raw data, and the calculator will do the rest. No more multiple calculations, no more pen and paper.
// You will need the use statement for the Math class from the Quorum libraries,
// and to instantiate a Math object to be used later
use Libraries.Compute.Math
Math math

// These lines ask the user for information on the month of January
text jan = input("Enter the number for January for the year one of the decade.")
text jan10 = input("Enter the number for January for the year ten of the decade.")
Write the input statements for the rest of the months.
You now have all the data you need from the user input, with one small problem.
The input command stores information as text, but we need numbers to be able to analyze the information.
To do this, we cast the information as a number. The cast command takes the parameters (variable type, variable name).
So we put in the variable type we want, in this case number, and then the name of the variable we want to change.
We also have to make new number variables for this to work.
// These lines create two new number variables to hold our January input,
// and cast that text input as numbers
number janN = cast(number, jan)
number jan10N = cast(number, jan10)
Cast the rest of the input statements to numbers.
We are now ready to find the averages of our data. To find the average of a set of data, add all the data points, and then divide by the number of points in the set. In this case, there are 12 points.
number avg1 = (janN + febN + marN + aprN + mayN + junN + julN + augN + septN + octN + novN + decN) / 12
Calculate the average of the year 10 data.
You can now round the averages. To do this, we will use the Round action from the Math library. Notice that the math object starts with a lower case "m." The Round action takes the parameters (variable, number of decimal places). We want to round to the nearest hundredth, so we put a 2 after the comma.
number roundedAvg1 = math:Round(avg1, 2)
Round the average of the year 10 data.
To find the percent increase between two numbers, subtract the smaller (starting) number from the larger number, then divide the answer by the smaller number. This will give you a decimal answer. To make it a percent, multiply by 100.
number percentIncrease = ((higherAvg - lowerAvg) / lowerAvg) * 100
Write the code to find the percent increase.
Use your rounded averages. Then round the percent increase. Finally, write a concatenated output statement to display the answer.
"In this decade, the carbon dioxide level in the Earth's atmosphere increased by 2.43 percent."
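A minimal sketch of a concatenated output statement follows. It assumes a hypothetical variable named roundedPercent that already holds your rounded result; the value 2.43 is only the sample figure from the sentence above, not a real measurement.

```quorum
// Hypothetical placeholder: in your program this value comes from your own calculation
number roundedPercent = 2.43

// The + operator joins the pieces of text and the number into one sentence
output "In this decade, the carbon dioxide level in the Earth's atmosphere increased by " + roundedPercent + " percent."
```

Notice the space before the closing quotation mark of the first piece and at the start of the last piece; without them, the number would run into the surrounding words.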
Use the program to find the increase rate in several different decades, such as the 1960s and the 2000s. Write down each finding on a separate sheet of paper.
In the data set, some months are listed as -99.99. This simply means that the researchers could not obtain valid data for that month. In that case, just use the data from the previous month as the input for your program.
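If you would rather have the program make this substitution for you, one possible approach is sketched below. The values and the variable names janN and febN are assumptions chosen for illustration; the sketch treats any negative reading as missing, since a real CO2 level cannot be below zero.

```quorum
// Hypothetical sample values: February is marked as missing with -99.99
number janN = 315.62
number febN = -99.99

// A negative reading is not a real measurement, so reuse the previous month's value
if febN < 0
    febN = janN
end

output "Value used for February: " + febN
```

The same check could be repeated for each month after it is cast to a number, always falling back to the month just before it.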
The increase rate that your program reports is a percentage. A 3 percent increase means that the CO2 level in the air rose by 3/100 of its year 1 value between year 1 and year 10 of the decade. If the year 1 measurement is 100.00, a 3 percent increase means that the year 10 measurement is 103.00. Thus, the larger the increase rate, the more CO2 was added to the Earth's atmosphere during that period (decade).
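The 100.00 and 103.00 example can be checked with a short sketch. The variable names year1Avg and year10Avg are assumptions for illustration, and the change is divided by the year 1 value, which is what the example implies. Running it should report an increase of about 3 percent.

```quorum
use Libraries.Compute.Math
Math math

// Sample averages taken from the explanation above, not real measurements
number year1Avg = 100.00
number year10Avg = 103.00

// Change relative to the year 1 value, expressed as a percent
number percentIncrease = ((year10Avg - year1Avg) / year1Avg) * 100
number roundedPercent = math:Round(percentIncrease, 2)

output "Percent increase: " + roundedPercent
```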
Do your findings show a consistent increase? If the rate fluctuates, which decades have a higher rate and which have a lower rate?
Now we have an understanding of the trend and pattern in the increase of the atmospheric carbon dioxide level. The next question is: what is causing this trend and pattern? Is it a naturally occurring phenomenon? Or is it caused by other factors, such as the growth of the human population, the loss of forests, or the increased use of motor vehicles?
Is there any other data set that shows a proportional relationship with the results of this data analysis exercise? Two data sets have a proportional relationship if they show matching increasing trends and patterns at similar rates. They have an inversely proportional relationship if one data set increases while the other decreases in a similar trend and pattern at a similar rate.
Within a small group, discuss and hypothesize (make an educated guess) about what is causing the increase in the atmospheric carbon dioxide level on Earth. Then try to find data that supports your group's hypothesis.
In the next tutorial, we will discuss Challenge 2.1, which describes how a Musical Piece works in Quorum.