Making a Scatter Plot
The next chart we will learn to create is a scatter plot. Scatter plots are used to observe the relationship between two variables. For example, we could compare the heights and diameters of trees, where the position of each dot corresponds to one tree's height and diameter. Taken as a whole, the data can show a relationship: strong positive or negative linear, moderate positive or negative linear, or no relationship. As data scientists, some of the most important patterns to look for are how points cluster together, whether there are gaps in the dataset, and whether there are outliers. We focus on these aspects to understand trends and make predictions about future data.
The dataset we are using to create this scatter plot is about dog breeds and their traits. The chart will compare maximum life span to maximum weight to see how they are related.
To follow along, we can download the dogs dataset here.
Here is a snippet of what the dataset should look like:
|Breeding Group||Maximum Life Span||Maximum Weight|
Loading and Formatting
As mentioned previously, to load and read in the dataset, we need to create a DataFrame object named frame. Using the frame, we call the Load function and pass in the file path of the dogs CSV. Recall that a CSV is a comma-separated text file that holds data.
use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.ScatterPlot

DataFrame frame
frame:Load("data/Dogs.csv")
Note that we stored this dataset in a folder named data.
Once the data has been loaded, we extract the parts of it we want to use in the chart. We will be using two functions from our frame object, AddSelectedColumns(text heading) and AddSelectedFactors(text heading). Because a scatter plot needs two variables to compare, we extract the maximum life span and maximum weight of the dogs using AddSelectedColumns(text heading). We also extract the breed group with AddSelectedFactors(text heading) so that the dots can be distinguished by group. The usage of these two functions is shown below:
|frame:AddSelectedColumns(text heading)||AddSelectedColumns() takes in a string that matches a column heading from our dataset. This function is used to choose the data for our axes. For this tutorial, we will call this function twice to extract "Maximum Life Span" and "Maximum Weight".||frame:AddSelectedColumns("heading")|
|frame:AddSelectedFactors(text heading)||AddSelectedFactors() takes in a string that matches a column heading from our dataset. This function is used to group our dots and form the legend for the two variables we are comparing. For this tutorial, we will be extracting "Breed Group".||frame:AddSelectedFactors("heading")|
We should have the following code:
// pull out selected data
frame:AddSelectedFactors("Breed Group")
frame:AddSelectedColumns("Maximum Life Span")
frame:AddSelectedColumns("Maximum Weight")
Now it is time to create the scatter plot, which can be done with the following code. This creates a chart object from our DataFrame object, frame. For the rest of this lesson, we will use this chart object to label and format our scatter plot.
// using the data frame, format data by creating a scatter plot chart component
ScatterPlot chart = frame:ScatterPlot()
chart:Display()
Example of loading our data and creating scatter plot object
Calling the Display() function gives us a pop-up window showing our chart so far. We still need to give meaning to our data, so the following steps show how to label and customize the chart.
Labeling the Scatter Plot
Labels help viewers understand what is being presented. This means that we will label the x axis, the y axis, and the legend, and give our chart a title that describes the dataset. To do so, we will call the following functions on our chart object: SetTitle(text name), SetXAxisTitle(text name), SetYAxisTitle(text name), SetLegendTitle(text name), and SetSubtitle(text name). Here is a brief description of what each function does and what it takes in.
|SetTitle(text name)||SetTitle() takes in a string as a parameter, which is the title of the chart. For this example, we will name the chart "Dog Weight and Life Span".||chart:SetTitle("Dog Weight and Life Span")|
|SetXAxisTitle(text name)||SetXAxisTitle() takes in a string as a parameter, which is the label of the x axis. For this example, we will label it "Maximum Life Span (years)".||chart:SetXAxisTitle("Maximum Life Span (years)")|
|SetYAxisTitle(text name)||SetYAxisTitle() takes in a string as a parameter, which is the label of the y axis. For this example, we will label it "Maximum Weight (pounds)". The axis label is also a good place to state the unit being measured, such as pounds.||chart:SetYAxisTitle("Maximum Weight (pounds)")|
|SetLegendTitle(text name)||SetLegendTitle() takes in a string as a parameter, which is the label of the chart's legend. The legend identifies which breed group each dot belongs to. For this example, we will label the legend "Breed Group".||chart:SetLegendTitle("Breed Group")|
|SetSubtitle(text title)||SetSubtitle() takes in a string as a parameter, which sets a subtitle under the title. This can be a short description or any other necessary information about our chart. For this example, we will set the subtitle to "Does weight correlate to life span for dogs?"||chart:SetSubtitle("Does weight correlate to life span for dogs?")|
// label your scatter plot
chart:SetXAxisTitle("Maximum Life Span (years)")
chart:SetYAxisTitle("Maximum Weight (pounds)")
chart:SetLegendTitle("Breed Group")
chart:SetSubtitle("Does weight correlate to life span for dogs?")
chart:SetTitle("Dog Weight and Life Span")
Note that if we would like to see the chart so far, we can call chart:Display() to view it with the labels we created.
Example of labeling our scatter plot
Customizing the Data Chart
Now that our data is labeled, we can customize the chart to our liking, such as moving the legend, changing the colors, and adjusting the font size. To try out these features, we will again use our chart object to call the functions. The functions we will be using are: SetLegendLocationToBottom(), SetColorPaletteToDisturbing(), SetFontSize(integer size), FlipOrientation(), and ShowLinearRegression(boolean). Here are brief descriptions of what each function does and how to use it.
|SetLegendLocationToBottom()||SetLegendLocationToBottom() takes in no parameters and places the legend at the bottom of the chart. Alternatively, we could use SetLegendLocationToTop(), SetLegendLocationToLeft(), or SetLegendLocationToRight().||chart:SetLegendLocationToBottom()|
|SetFontSize(integer size)||SetFontSize() takes in an integer as a parameter and sets the font size of all text to that value. For this tutorial, we will use 30 as the font size.||chart:SetFontSize(30)|
|SetColorPaletteToDisturbing()||SetColorPaletteToDisturbing() takes in no parameters and sets the color palette to one based on yellows, browns, oranges, and greens.||chart:SetColorPaletteToDisturbing()|
|ShowLinearRegression(boolean)||ShowLinearRegression() takes in a true or false value (boolean) and, when given true, shows the regression lines and equations for the chart.||chart:ShowLinearRegression(true)|
|FlipOrientation()||FlipOrientation() takes in no parameters and swaps the places of the x and y axes.||chart:FlipOrientation()|
// set the legend location, choices are left, right, top and bottom
chart:SetLegendLocationToBottom()
// color palette contains yellows, oranges, browns, and greens
chart:SetColorPaletteToDisturbing()
// adjust font size by preference, here we set it to 30 pt
chart:SetFontSize(30)
// if we would like to switch the x and y axis
chart:FlipOrientation()
Example of customizing our scatter plot
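The customization code above does not call ShowLinearRegression(), even though it is described in the table. Here is a minimal sketch of turning it on, using only the functions introduced in this lesson and assuming the same data/Dogs.csv path and column headings as before:

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.ScatterPlot

// load the dogs dataset (path assumed from earlier in this lesson)
DataFrame frame
frame:Load("data/Dogs.csv")

// select the factor and the two columns to compare
frame:AddSelectedFactors("Breed Group")
frame:AddSelectedColumns("Maximum Life Span")
frame:AddSelectedColumns("Maximum Weight")

ScatterPlot chart = frame:ScatterPlot()

// show the regression lines and equations on the chart
chart:ShowLinearRegression(true)
chart:Display()
```

Passing true is the usage shown in the table, so that is the only value this sketch relies on.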
Congratulations, our scatter plot is constructed! Now we can display the chart with the Display() function. There are two ways to do this: letting it size itself automatically or specifying a window size. Calling chart:Display() displays the chart at a size equal to the screen size. Calling chart:Display(num, num) displays the chart in a window constrained to the given size. We will be using the version with a specified size.
Now we can clean, build, and run our program, and shortly we should see a Game window pop up. This is our scatter plot! To view the entire code, click here to view the file.
Full Example of the Scatter Plot
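For reference, the snippets from this lesson can be assembled into one program roughly as follows. This is a sketch rather than the official file: the window size of 1000 by 800 passed to Display() is an illustrative assumption, and the file path and column headings are the ones used earlier in the lesson.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.ScatterPlot

// load the dataset
DataFrame frame
frame:Load("data/Dogs.csv")

// pull out selected data
frame:AddSelectedFactors("Breed Group")
frame:AddSelectedColumns("Maximum Life Span")
frame:AddSelectedColumns("Maximum Weight")

// create the scatter plot from the data frame
ScatterPlot chart = frame:ScatterPlot()

// label the scatter plot
chart:SetTitle("Dog Weight and Life Span")
chart:SetSubtitle("Does weight correlate to life span for dogs?")
chart:SetXAxisTitle("Maximum Life Span (years)")
chart:SetYAxisTitle("Maximum Weight (pounds)")
chart:SetLegendTitle("Breed Group")

// customize the scatter plot
chart:SetLegendLocationToBottom()
chart:SetColorPaletteToDisturbing()
chart:SetFontSize(30)
chart:FlipOrientation()

// display in a window of a chosen size (1000 x 800 is an assumed example value)
chart:Display(1000, 800)
```

If the window size does not matter, the last line can be replaced with chart:Display().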