Calculating the Interquartile Range

In data science, the interquartile range (IQR) is a measure of variability based on dividing a data set into quartiles. Specifically, the IQR describes the spread of the middle half of the distribution. For example, a box plot splits the data into four quartiles, and the IQR covers the two middle ones, from the first quartile (Q1) to the third quartile (Q3). In other words, the IQR is the distance between Q3 and Q1.

$$\mathrm{IQR} = Q_3 - Q_1$$
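
As a worked illustration, consider a small made-up data set of eight values (not taken from the dry bean data). This sketch assumes one common convention, in which Q1 and Q3 are the medians of the lower and upper halves of the sorted data. Other quartile conventions exist and can give slightly different numbers, and this tutorial does not state which one InterQuartileRange uses.

```latex
\begin{align*}
\text{sorted data} &: 1,\ 3,\ 5,\ 7,\ 9,\ 11,\ 13,\ 15 \\
Q_1 &= \tfrac{3 + 5}{2} = 4 \qquad \text{(median of the lower half } 1, 3, 5, 7\text{)} \\
Q_3 &= \tfrac{11 + 13}{2} = 12 \qquad \text{(median of the upper half } 9, 11, 13, 15\text{)} \\
\mathrm{IQR} &= Q_3 - Q_1 = 12 - 4 = 8
\end{align*}
```

Under this convention, the middle half of the values spans a range of 8.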

The IQR is calculated with a helper action in the DataFrame class. In this example, we calculate the IQR of the 'Area' column, which holds the area measurements in the dry bean classification data. To do this, we call the action InterQuartileRange() on our 'frame' object. Here is a brief description of how InterQuartileRange() works.

IQR Function
Function | Description | Usage
frame:InterQuartileRange() | This action sorts the selected column and calculates the IQR of that column. Note that it can only calculate the IQR of one column at a time. | frame:InterQuartileRange()

Here are other functions that we use in the example:

Helper Functions for IQR
Function | Description | Usage
InterQuartileRange range = frame:InterQuartileRange() | Creates an InterQuartileRange object named range from the frame we created earlier | InterQuartileRange range
range:GetMinimum() | Returns the smallest value in the column that is not an outlier | output "Minimum: " + range:GetMinimum()
range:GetMaximum() | Returns the largest value in the column that is not an outlier | output "Maximum: " + range:GetMaximum()
range:GetInterQuartileRange() | Returns the IQR calculated from the selected column | output "Range: " + range:GetInterQuartileRange()
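
GetMinimum() and GetMaximum() are described in terms of values that are not outliers. One widely used way to decide this, usually credited to Tukey, treats any value more than 1.5 times the IQR below Q1 or above Q3 as an outlier. Whether InterQuartileRange applies exactly this rule is an assumption here, not something the tutorial confirms. With hypothetical quartiles Q1 = 4 and Q3 = 12, the cutoffs would be:

```latex
\begin{align*}
\mathrm{IQR} &= Q_3 - Q_1 = 12 - 4 = 8 \\
\text{lower fence} &= Q_1 - 1.5 \cdot \mathrm{IQR} = 4 - 12 = -8 \\
\text{upper fence} &= Q_3 + 1.5 \cdot \mathrm{IQR} = 12 + 12 = 24
\end{align*}
```

Under this rule, the reported minimum and maximum would be the smallest and largest data values that lie between -8 and 24.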

The following code calculates the IQR:

//We need the DataFrame class to load in files for Data Science operations.
use Libraries.Compute.Statistics.DataFrame
//This is the calculation for the interquartile range
use Libraries.Compute.Statistics.Calculations.InterQuartileRange

//Create a DataFrame, which is essentially a table that understands 
//more information about the data that is being loaded.
//Using the default loader is enough for our purposes
DataFrame frame

//Before calculating, load the data into the frame and tell it we want the 'Area' column selected

InterQuartileRange range = frame:InterQuartileRange()
output "Minimum: " + range:GetMinimum()
output "Maximum: " + range:GetMaximum()
output "Range: " + range:GetInterQuartileRange()

Congratulations! We have just learned how to calculate the interquartile range.

Next Tutorial

In the next tutorial, we will discuss the z-score, which describes how many standard deviations a value lies from the mean.