Calculating the Median

In statistics, the median is the middle value of a numerical dataset that has been sorted in ascending or descending order. Depending on context, the median may describe a dataset better than the mean (average) because it is not skewed by outliers. The median separates the upper half of a dataset from the lower half.

$$\text{median} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ value}$$
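To make the position rule concrete, here is a small worked example. The numbers are illustrative values chosen for this sketch and are not taken from the dry beans dataset; n denotes the number of values in the sorted dataset. When n is even, the position (n+1)/2 falls between two values, and the usual convention is to average them.

```latex
% Odd number of values: the position (n+1)/2 is a whole number.
\text{sorted data: } 3,\ 7,\ 8,\ 12,\ 15 \qquad n = 5
\qquad \frac{n+1}{2} = \frac{5+1}{2} = 3
\qquad \text{median} = 3^{\text{rd}} \text{ value} = 8

% Even number of values: the position falls between two values,
% so the two middle values are averaged.
\text{sorted data: } 3,\ 7,\ 8,\ 12 \qquad n = 4
\qquad \frac{n+1}{2} = \frac{4+1}{2} = 2.5
\qquad \text{median} = \frac{7 + 8}{2} = 7.5

% Effect of an outlier: replace 15 with 150 in the first dataset.
\text{sorted data: } 3,\ 7,\ 8,\ 12,\ 150
\qquad \text{median} = 8 \ (\text{unchanged})
\qquad \text{mean: } \frac{45}{5} = 9 \ \rightarrow\ \frac{180}{5} = 36
```

The last case shows why the median can be the better summary when outliers are present: one extreme value quadruples the mean but leaves the median where it was.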

The median is calculated using the helper action Median(), which is part of the DataFrame class. To do this, we will use our 'frame' object and call the action Median(). In this case, we will calculate the median of measurement columns from the dry bean classification dataset, such as the perimeter, the area, or the minor axis length. Here is a brief description of how Median() works.

Median Function
Function: dataFrameObject:Median()
Description: This action sorts the selected column and calculates the middle value of that column. Note that it can only calculate the median of one column at a time.
Usage: frame:Median()

The following code shows how to calculate the median:

//We need the DataFrame class to load in files for Data Science operations.
use Libraries.Compute.Statistics.DataFrame

//Create a DataFrame, which is essentially a table that understands 
//more information about the data that is being loaded.
DataFrame frame

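//This loads data relative to the project, so put the dryBeans file in the Data/Miscellaneous folder.
//Assumption: the file is named dryBeans.csv; adjust the path to match your project.
frame:Load("Data/Miscellaneous/dryBeans.csv")

//Median() works on one selected column at a time. We can select the column by its
//header name (assumed here to be MinorAxisLength) and then calculate the median.
frame:AddSelectedColumns("MinorAxisLength")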
text someText1 = "Median of the column Minor Axis Length of dry beans:"
output someText1
output frame:Median()


Congrats! We have just learned how to calculate the median! To view the whole file, we can click here.

Next Tutorial

In the next tutorial, we will discuss how to calculate the mode.