Learn About Calculating the Interquartile Range

This tutorial describes how to calculate the interquartile range.

Calculating the Interquartile Range

In data science, the Interquartile Range (IQR) is a measure of variability based on dividing a data set into quartiles. Specifically, the IQR describes the spread of the middle half of the distribution. For example, in a box plot the IQR is the width of the box, which covers the two middle quarters of the data, from the first quartile (Q1) to the third quartile (Q3). It is calculated as the difference between the third and first quartiles:

$\text{IQR} = Q_3 - Q_1$
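
As a small worked example (the eight values are made up for illustration and are not taken from the dry bean data), consider the sorted data set 1, 3, 5, 7, 9, 11, 13, 15. This assumes the convention that Q1 and Q3 are the medians of the lower and upper halves of the data. Other quartile conventions exist and can give slightly different values.

\[
Q_1 = \frac{3 + 5}{2} = 4, \qquad Q_3 = \frac{11 + 13}{2} = 12, \qquad \text{IQR} = Q_3 - Q_1 = 12 - 4 = 8
\]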

The Interquartile Range is calculated by using a helper action in the DataFrame class. We will calculate the Interquartile Range (IQR) for the 'Area' column. To do this, we will use our 'frame' object and call the action InterQuartileRange(). In this case, we will be calculating the IQR of the area values in the dry bean classification data. Here is a brief description of how InterQuartileRange() works.

IQR Function
Function | Description | Usage
frame:InterQuartileRange() | This action sorts the selected column and calculates the IQR of that column. Note that it can only calculate the IQR of one column at a time. | frame:InterQuartileRange()

Here are other functions that we use in the example:

Helper Functions for IQR
Function | Description | Usage
InterQuartileRange range = frame:InterQuartileRange() | Creates an InterQuartileRange object named range from the frame we created earlier | InterQuartileRange range
range:GetMinimum() | Returns the minimum value that is not an outlier | output "Minimum: " + range:GetMinimum()
range:GetMaximum() | Returns the maximum value that is not an outlier | output "Maximum: " + range:GetMaximum()
range:GetInterQuartileRange() | Returns the IQR calculated from the data set | output "Range: " + range:GetInterQuartileRange()

Here is some code that shows how to calculate the IQR:

//We need the DataFrame class to load in files for Data Science operations.
use Libraries.Compute.Statistics.DataFrame
//This is the calculation for the interquartile range
use Libraries.Compute.Statistics.Calculations.InterQuartileRange

//Create a DataFrame, which is essentially a table that understands 
//more information about the data that is being loaded.
//Using the default loader is enough for our purposes
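DataFrame frame
//Assumption: the load step is missing from this listing. The path below
//is a placeholder; point it at your own copy of the dry bean data set.
frame:Load("Data/DryBean.csv")

//Tell the frame we want the first column selected
//Assumption: the selection call is also missing from this listing;
//column 0 is taken to be the 'Area' column.
frame:AddSelectedColumn(0)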

InterQuartileRange range = frame:InterQuartileRange()
output "Minimum: " + range:GetMinimum()
output "Maximum: " + range:GetMaximum()
output "Range: " + range:GetInterQuartileRange()
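
To sanity-check the numbers a program like this prints, the same quantities can be computed by hand outside Quorum. The Python sketch below is not part of the Quorum library and makes two assumptions that this tutorial does not confirm: quartiles are taken as the medians of the lower and upper halves of the sorted data, and a value counts as an outlier when it lies more than 1.5 times the IQR beyond Q1 or Q3 (a common convention). The function names and the sample values are hypothetical.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q3) using the median-of-halves convention (an assumption)."""
    if len(values) < 2:
        raise ValueError("need at least two values to compute quartiles")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the median position
    upper = ordered[-half:]   # values above the median position
    return median(lower), median(upper)


def interquartile_range(values):
    """Return Q3 - Q1 for the given values."""
    q1, q3 = quartiles(values)
    return q3 - q1


def non_outlier_bounds(values, k=1.5):
    """Return the smallest and largest values inside the k * IQR fences.

    The fence rule is a common convention, assumed here for illustration.
    """
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return min(inside), max(inside)


# Hypothetical sample with one unusually large value (40)
sample = [1, 3, 5, 7, 9, 11, 13, 15, 40]
minimum, maximum = non_outlier_bounds(sample)
print("Minimum:", minimum)
print("Maximum:", maximum)
print("Range:", interquartile_range(sample))
```

With this sample, the value 40 falls outside the upper fence, so the reported maximum is the largest remaining value rather than 40. If the library uses a different quartile or outlier convention, its output may differ slightly from this sketch.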

Try it Yourself!

Press the blue run button to execute the code in the code editor. Press the red stop button to end the program. Your program will work when the console outputs "Build Successful!"

Congrats! We have just learned how to calculate the Interquartile Range!

Next Tutorial

In the next tutorial, we will discuss how to calculate the standard deviation from the mean (z-score).