Learn About Calculating the Interquartile Range
This tutorial describes how to calculate the interquartile range.
Calculating the Interquartile Range
In data science, the Interquartile Range (IQR) is a measure of variability based on dividing a data set into quartiles. Specifically, the IQR describes the spread of the middle half of the distribution. For example, a box plot splits the data into four quartiles, and the IQR covers the two middle ones: it is the distance from the first quartile (Q1) to the third quartile (Q3), that is, Q3 - Q1. Because it leaves out the lowest and highest quarters of the data, the IQR is less sensitive to extreme values than the full range.
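To make the definition concrete, here is a small worked example on a made-up data set of eight values (not taken from the dry beans file). It assumes the "median of each half" convention for quartiles. Other conventions interpolate differently and can give slightly different results, and this tutorial does not state which one the library uses.

```latex
\begin{aligned}
\text{data (sorted)} &: \; 1,\ 3,\ 5,\ 7,\ 9,\ 11,\ 13,\ 15 \\
Q_1 &= \operatorname{median}(1, 3, 5, 7) = \frac{3 + 5}{2} = 4 \\
Q_3 &= \operatorname{median}(9, 11, 13, 15) = \frac{11 + 13}{2} = 12 \\
\mathrm{IQR} &= Q_3 - Q_1 = 12 - 4 = 8
\end{aligned}
```

Under this convention, the middle half of the data spans 8 units, even though the full range (15 - 1) is 14.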
The Interquartile Range is calculated using a helper action in the DataFrame class. In this example, we will calculate the IQR of the 'Area' column in a data set of dry bean classifications. To do this, we call the action InterQuartileRange() on our 'frame' object. Here is a brief description of how InterQuartileRange() works.
Function | Description | Usage |
---|---|---|
frame:InterQuartileRange() | This action sorts the selected column and calculates its IQR. Note that it can only calculate the IQR of one column at a time. | frame:InterQuartileRange() |
Here are other functions that we use in the example:
Function | Description | Usage |
---|---|---|
InterQuartileRange range = frame:InterQuartileRange() | Declares an InterQuartileRange object named range and assigns it the result calculated from the frame we created earlier | InterQuartileRange range |
range:GetMinimum() | Returns the smallest value in the column that is not an outlier | output "Minimum: " + range:GetMinimum() |
range:GetMaximum() | Returns the largest value in the column that is not an outlier | output "Maximum: " + range:GetMaximum() |
range:GetInterQuartileRange() | Returns the IQR calculated from the selected column | output "Range: " + range:GetInterQuartileRange() |
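The table says GetMinimum() and GetMaximum() skip outliers, but it does not say how an outlier is identified. One widely used convention is Tukey's rule, which treats any value more than 1.5 times the IQR below Q1 or above Q3 as an outlier. Whether the library applies exactly this rule is an assumption here. The sketch below uses a made-up data set and the "median of each half" quartile convention, which is also an assumption.

```latex
\begin{aligned}
\text{data (sorted)} &: \; 1,\ 3,\ 5,\ 7,\ 9,\ 11,\ 13,\ 40 \\
Q_1 &= \frac{3 + 5}{2} = 4, \qquad Q_3 = \frac{11 + 13}{2} = 12, \qquad \mathrm{IQR} = 12 - 4 = 8 \\
\text{lower fence} &= Q_1 - 1.5 \cdot \mathrm{IQR} = 4 - 12 = -8 \\
\text{upper fence} &= Q_3 + 1.5 \cdot \mathrm{IQR} = 12 + 12 = 24
\end{aligned}
```

Under this rule, 40 lies above the upper fence and counts as an outlier. The smallest non-outlier is then 1 and the largest is 13.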
Here is some code that calculates the IQR:
//We need the DataFrame class to load in files for Data Science operations.
use Libraries.Compute.Statistics.DataFrame
//This is the calculation for the interquartile range
use Libraries.Compute.Statistics.Calculations.InterQuartileRange
//Create a DataFrame, which is essentially a table that understands
//more information about the data that is being loaded.
//Using the default loader is enough for our purposes
DataFrame frame
frame:Load("../Data/Miscellaneous/DryBeans.csv")
//Tell the frame we want the first column selected
frame:AddSelectedColumn(0)
InterQuartileRange range = frame:InterQuartileRange()
output "Minimum: " + range:GetMinimum()
output "Maximum: " + range:GetMaximum()
output "Range: " + range:GetInterQuartileRange()
Try it Yourself!
Press the blue run button to execute the code in the code editor. Press the red stop button to end the program. Your program will work when the console outputs "Build Successful!"
Congrats! We have just learned how to calculate the Interquartile Range! To view the whole file, we can click here.
Next Tutorial
In the next tutorial, we will discuss standard deviations from the mean, which describes how to calculate how many standard deviations a value lies from the mean (its z-score).