Making a Box Plot

The next chart to learn about is the box plot. Data scientists use box plots to show how data is distributed across quartiles, highlighting the lower, middle, and upper ranges of values. Box plots also mark outliers relative to the lower and upper extremes, so we can examine what differs from the norm. Like a histogram, a box plot is a useful way to see the distribution of numerical data.

In this tutorial, we will compare the average heights of males and females around the world, using data from 199 countries collected in 2022.

To follow along, we can download the Height of Male and Female by Country 2022 dataset here.

Here is a snippet of what the dataset should look like:

Height of Male and Female by Country 2022 CSV
Country Name | Male Height in Cm | Female Height in Cm
Netherlands | 183.78 | 170.36
Montenegro | 183.30 | 169.96
Estonia | 182.79 | 168.66
Bosnia and Herzegovina | 182.47 | 167.47
Iceland | 182.10 | 168.91
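
To make the idea of quartiles and outliers from the introduction concrete, here is a minimal sketch that works out box plot statistics by hand for the five male heights in the snippet above. It uses one common quartile convention (the median of each half, excluding the overall median) and the usual 1.5 times the interquartile range rule for outliers. These conventions are assumptions for illustration and may differ from what Quorum's BoxPlot computes internally. The variable names are our own.

```quorum
use Libraries.Containers.Array

// Male heights (cm) from the five sample rows shown above
Array<number> heights
heights:Add(183.78)
heights:Add(183.30)
heights:Add(182.79)
heights:Add(182.47)
heights:Add(182.10)

// Sort ascending so that positions correspond to rank
heights:Sort()

// With five values, the median is the middle (third) value
number median = heights:Get(2)

// Assumed convention: each quartile is the median of the half
// below or above the overall median (the median itself is excluded)
number lowerQuartile = (heights:Get(0) + heights:Get(1)) / 2.0
number upperQuartile = (heights:Get(3) + heights:Get(4)) / 2.0

// Interquartile range and the 1.5 * IQR outlier fences
number iqr = upperQuartile - lowerQuartile
number lowerFence = lowerQuartile - 1.5 * iqr
number upperFence = upperQuartile + 1.5 * iqr

output "Median: " + median
output "Lower quartile: " + lowerQuartile
output "Upper quartile: " + upperQuartile
output "Values below " + lowerFence + " or above " + upperFence + " would count as outliers"
```

In a box plot, the box spans the lower to upper quartile, the line inside it marks the median, and points beyond the fences are drawn as outliers.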

Loading and Formatting

As mentioned previously, to load and read in the dataset, we need to create a DataFrame component named 'frame'. We then call the frame's Load function with the file path of the height CSV. Recall that a CSV is a comma-separated text file that holds data.


use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.BoxPlot

/*
    This is an example of a simple box plot in Quorum.
    The data collected is the average height of both sexes in 199 countries.
*/

// create dataframe to read in data
DataFrame frame
frame:Load("../Data/Population/Height of Male and Female by Country 2022.csv")

Note that we stored this dataset in a Data folder, inside an inner folder named Population.

Once the data has been loaded, we select the parts of it to use in the chart. The frame component provides two functions for this, AddSelectedColumns(text header) and AddSelectedFactors(text header). The selected columns label our x-axis, signifying the groups we are observing, and the factor is used to label our y-axis. Each function identifies a column by either its column number or its column label in the CSV file. We will be using the column label.

Please note that for a box plot, we will only be using AddSelectedColumns(text header), because both the dataset and this type of chart are simple. The same applies to any other dataset where we do not modify the y-axis. AddSelectedColumns(text header) takes a string that matches a column header in the data file.

Adding CSV columns onto Charts
Function | Description | Usage
frame:AddSelectedColumns(text heading) | AddSelectedColumns() takes in a string that matches a column heading from our dataset. This function is used to format our axes. For this tutorial, we will call this function twice to extract "Male Height in Cm" and "Female Height in Cm". | frame:AddSelectedColumns("heading")

We should have the following code:


// pull out selected data, for this we will be categorizing by Male and Female Height
frame:AddSelectedColumns("Male Height in Cm")
frame:AddSelectedColumns("Female Height in Cm")

Now it is time to create the box plot, which can be done with the following code. This creates a chart object from our DataFrame component, frame. For the rest of this lesson, we will use this chart object to change and format our box plot.


// using the data frame, format data by creating a box plot chart component
BoxPlot chart = frame:BoxPlot()
chart:Display()

Example of loading our data and creating box plot object

Calling the Display() function will give us a pop-up of our formatted data so far. We still need to give meaning to our data, therefore, the following steps will show us how to label and customize our chart.

Labeling the Box Plot

Labels help viewers understand what is being presented. We will label the x-axis, y-axis, and legend, and give the chart a title and subtitle that describe the dataset. To do so, we call the following functions on our chart object: SetTitle(), SetXAxisTitle(), SetYAxisTitle(), SetLegendTitle(), and SetSubtitle(). Here is a brief description of what each function does and what it takes in.

Labeling Charts
Function | Description | Usage
SetTitle(text name) | SetTitle() takes in a string as a parameter, which becomes the title of the chart. For this example, we will name the chart "Height of Male and Female in the World". | chart:SetTitle("Height of Male and Female in the World")
SetXAxisTitle(text name) | SetXAxisTitle() takes in a string as a parameter, which becomes the label of the x-axis. For this example, we will label it "Sex". | chart:SetXAxisTitle("Sex")
SetYAxisTitle(text name) | SetYAxisTitle() takes in a string as a parameter, which becomes the label of the y-axis. For this example, we will label it "Height (cm)". | chart:SetYAxisTitle("Height (cm)")
SetLegendTitle(text name) | SetLegendTitle() takes in a string as a parameter, which labels the legend of the chart. The legend identifies the groups shown in the chart. For this example, we will label the legend "Average Height of a Human". | chart:SetLegendTitle("Average Height of a Human")
SetSubtitle(text title) | SetSubtitle() takes in a string as a parameter, which sets a subtitle under the title. This can be a short description or any other necessary information for our chart. For this example, we will set the subtitle to "What is the average height of the population by sex?" | chart:SetSubtitle("What is the average height of the population by sex?")


// label your box plot
chart:SetXAxisTitle("Sex")
chart:SetYAxisTitle("Height (cm)")
chart:SetLegendTitle("Average Height of a Human")
chart:SetSubtitle("What is the average height of the population by sex?")
chart:SetTitle("Height of Male and Female in the World")

Note: to see the chart so far, we can call chart:Display() to view it with the labels we created.

Example of labeling our box plot

Customizing the Data Chart

Now that our chart is labeled, we can customize it to our liking, such as moving the legend, changing the color palette, adjusting the font size, and flipping the axes. To do so, we will again use our chart object to call these functions: SetLegendLocation(text location), SetColorPaletteToDisturbing(), SetFontSize(integer size), and FlipOrientation(). Here are brief descriptions of what each function does and how to use it.

Customizing Charts
Function | Description | Usage
SetLegendLocation(text location) | SetLegendLocation() takes in a string as a parameter, which is one of the directions left, right, top, or bottom. The legend is placed at the specified location. For this example, we will place the legend on the "bottom". | chart:SetLegendLocation("bottom")
SetFontSize(integer size) | SetFontSize() takes in an integer as a parameter and sets the font size of all text to that value. For this tutorial, we will use 20 as the font size. | chart:SetFontSize(20)
SetColorPaletteToDisturbing() | SetColorPaletteToDisturbing() takes in no parameters and sets the color palette to one based on yellows, browns, oranges, and greens. | chart:SetColorPaletteToDisturbing()
FlipOrientation() | FlipOrientation() takes in no parameters and swaps the places of the x and y axes. | chart:FlipOrientation()


// set the legend location, choices are left, right, top and bottom
chart:SetLegendLocation("bottom")

// color palette contains yellows, oranges, browns, and greens
chart:SetColorPaletteToDisturbing()

// adjust font size by preference, here we set it to 20 pt
chart:SetFontSize(20)

// if we would like to switch the x and y axis
chart:FlipOrientation()

Example of customizing our box plot

Congratulations, our box plot is constructed! Now we can display our chart with the Display() function. There are two ways to do this: letting the chart size itself automatically, or specifying a window size. Calling chart:Display() displays the chart at a size equal to the screen size. Calling chart:Display(num, num) displays the chart in a window constrained to the given size. Here we specify a window size of 1000 by 750.

chart:Display(1000, 750)

Now we can clean, build, and run our program, and we should shortly see a Game window pop up. This is our box plot! To view the entire code, click here to view the file.
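
For reference, the snippets from this tutorial can be assembled into a single program. This sketch only puts the code shown above in order, so it introduces no new functions. It assumes the CSV sits at the same relative path used earlier, and it leaves out the intermediate chart:Display() call so the chart is shown once at the end.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.BoxPlot

// create dataframe and read in the data
DataFrame frame
frame:Load("../Data/Population/Height of Male and Female by Country 2022.csv")

// select the two height columns, one box per column
frame:AddSelectedColumns("Male Height in Cm")
frame:AddSelectedColumns("Female Height in Cm")

// create the box plot from the data frame
BoxPlot chart = frame:BoxPlot()

// label the box plot
chart:SetTitle("Height of Male and Female in the World")
chart:SetSubtitle("What is the average height of the population by sex?")
chart:SetXAxisTitle("Sex")
chart:SetYAxisTitle("Height (cm)")
chart:SetLegendTitle("Average Height of a Human")

// customize the box plot
chart:SetLegendLocation("bottom")
chart:SetColorPaletteToDisturbing()
chart:SetFontSize(20)
chart:FlipOrientation()

// display the chart in a 1000 by 750 window
chart:Display(1000, 750)
```

If the file is stored somewhere else, only the path passed to Load needs to change.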

Full Example of the Box Plot

Final Chart

Further Useful Box Plot Functions

Extra Functions
Function | Description | Usage
HideOutliers() | This function does not take any parameters and it will hide the outliers from the chart. | chart:HideOutliers()
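
As a brief illustration of the table above, the sketch below builds the same box plot and hides its outliers before displaying it. Calling HideOutliers() before Display() is an assumption about ordering, chosen to match how the other chart settings are applied in this tutorial. The file path is the one used earlier.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.BoxPlot

// load the data and select the two height columns
DataFrame frame
frame:Load("../Data/Population/Height of Male and Female by Country 2022.csv")
frame:AddSelectedColumns("Male Height in Cm")
frame:AddSelectedColumns("Female Height in Cm")

// create the box plot and hide the outlier points
BoxPlot chart = frame:BoxPlot()
chart:SetTitle("Height of Male and Female in the World")
chart:HideOutliers()

// show the chart using the default window size
chart:Display()
```

Hiding outliers only changes what is drawn. It can be useful when a few extreme values distract from the comparison between the two boxes.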

To view more examples with charts, we can reference the Quorum Curriculum Repository for charts.

Next Tutorial

In the next tutorial, we will discuss the violin plot and how to use the violin plot chart in Quorum.