Calculating the Interquartile Range
In data science, the Interquartile Range (IQR) is a measure of variability based on dividing a data set into quartiles. Specifically, the IQR describes the spread of the middle half of the distribution. For example, in a box plot the IQR is the length of the box: the distance from the first quartile (Q1) to the third quartile (Q3), that is, Q3 minus Q1. The interquartile range therefore tells us how spread out the central 50% of the data is.
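As a small worked example of the arithmetic, consider eight made-up values (not taken from the dry beans data). Here Q1 is taken as the median of the lower half of the sorted data and Q3 as the median of the upper half. This is only one common convention; other conventions interpolate between values and can give slightly different quartiles, and the InterQuartileRange action may not use the same one.

```latex
\begin{aligned}
\text{sorted data} &: 2,\ 4,\ 6,\ 8,\ 10,\ 12,\ 14,\ 16 \\
Q_1 &= \operatorname{median}(2, 4, 6, 8) = \frac{4 + 6}{2} = 5 \\
Q_3 &= \operatorname{median}(10, 12, 14, 16) = \frac{12 + 14}{2} = 13 \\
\mathrm{IQR} &= Q_3 - Q_1 = 13 - 5 = 8
\end{aligned}
```

In this example the middle half of the values spans 8 units, even though the full data set spans 14.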
The Interquartile Range is calculated using a helper action in the DataFrame class. We will calculate the IQR for the 'Area' column of the dry beans data set. To do this, we use our 'frame' object and call the action InterQuartileRange(). Here is a brief description of how InterQuartileRange() works.
|frame:InterQuartileRange()||This action sorts the selected column and calculates its IQR. Note that it can only calculate the IQR of one column at a time.||frame:InterQuartileRange()|
Here are other functions that we use in the example:
|InterQuartileRange range = frame:InterQuartileRange()||Creates an InterQuartileRange object named range from the frame we created earlier||InterQuartileRange range|
|range:GetMinimum()||Returns the minimum value in the column that is not an outlier||output "Minimum: " + range:GetMinimum()|
|range:GetMaximum()||Returns the maximum value in the column that is not an outlier||output "Maximum: " + range:GetMaximum()|
|range:GetInterQuartileRange()||Returns the IQR calculated for the selected column||output "Range: " + range:GetInterQuartileRange()|
Here is some code on how to calculate the IQR:
//We need the DataFrame class to load in files for Data Science operations.
use Libraries.Compute.Statistics.DataFrame
//This is the calculation for the interquartile range
use Libraries.Compute.Statistics.Calculations.InterQuartileRange

//Create a DataFrame, which is essentially a table that understands
//more information about the data that is being loaded.
//Using the default loader is enough for our purposes
DataFrame frame
frame:Load("../Data/Miscellaneous/DryBeans.csv")

//Tell the frame we want the first column ('Area') selected
frame:AddSelectedColumn(0)

//Calculate the IQR of the selected column and print the results
InterQuartileRange range = frame:InterQuartileRange()
output "Minimum: " + range:GetMinimum()
output "Maximum: " + range:GetMaximum()
output "Range: " + range:GetInterQuartileRange()
Example of calculating the interquartile range
Congrats! We have just learned how to calculate the Interquartile Range! To view the whole file, we can click here.
In the next tutorial, we will discuss the z-score, which describes how many standard deviations a value lies from the mean.