Libraries.Compute.Statistics.DataFrame Documentation

The DataFrame class is a collection of columns and rows, like a spreadsheet, that can be used for statistics and other calculations. By default, it can load comma separated files. Other file types can be supported using the Load action with a file loader for the custom type. DataFrame objects can also be transformed using the Transform action, which is useful for sorting, filtering, or other operations. Transforms generally make a copy of the data frame and act on that copy, not the original.

Example Code

use Libraries.Compute.Statistics.DataFrame

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv")

Inherits from: Libraries.Language.Object

Actions Documentation

Add(Libraries.Compute.Statistics.DataFrameSelectionListener listener)

Classes can register as listeners of the selection in the DataFrame.

Parameters

  • Libraries.Compute.Statistics.DataFrameSelectionListener listener

AddColumn(integer index, Libraries.Compute.Statistics.DataFrameColumn column)

This action adds a column to the data frame. It is destructive in that it changes the existing DataFrame without making a copy.

Parameters

  • integer index
  • Libraries.Compute.Statistics.DataFrameColumn column

Example

//We need the DataFrame class to load in files for Data Science operations.
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Columns.NumberColumn
use Libraries.Containers.Array
use Libraries.Compute.Statistics.DataFrameColumn

//Create a DataFrame, which is essentially a table that understands 
//more information about the data that is being loaded.
DataFrame frame

//This creates a NumberColumn, which contains numbers
NumberColumn column
column:SetHeader("My Column")
column:Add(1)
column:Add(2)
column:Add(3)
column:Add(4)
column:Add(5)
column:Add(6)
frame:AddColumn(0, column)

//No file is loaded here; the frame is built in memory and then output as a text value to the console.
output frame:ToText()

AddColumn(Libraries.Compute.Statistics.DataFrameColumn column)

This action adds a column to the data frame. It is destructive in that it changes the existing DataFrame without making a copy.

Parameters

  • Libraries.Compute.Statistics.DataFrameColumn column

Example

//We need the DataFrame class to load in files for Data Science operations.
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Columns.NumberColumn
use Libraries.Containers.Array
use Libraries.Compute.Statistics.DataFrameColumn

//Create a DataFrame, which is essentially a table that understands 
//more information about the data that is being loaded.
DataFrame frame

//This creates a NumberColumn, which contains numbers
NumberColumn column
column:SetHeader("My Column")
column:Add(1)
column:Add(2)
column:Add(3)
column:Add(4)
column:Add(5)
column:Add(6)
frame:AddColumn(column)

//No file is loaded here; the frame is built in memory and then output as a text value to the console.
output frame:ToText()

AddColumn(text column, text source)

This action takes a Quorum expression text value and uses it to create a new column in the DataFrame. The expression follows the normal rules for Quorum, using the DataFrame's columns as the allowable variables. For example, if a DataFrame has an integer column named Group, then an expression like Group * 2 takes the value of Group in each row and multiplies it by 2. If a row holds an invalid type, an undefined value is placed at that position. The AddColumn(text, text) call is not destructive: it adds a column to the DataFrame but does not change the original data.

Parameters

  • text column
  • text source

Example

use Libraries.Compute.Statistics.DataFrame
DataFrame frame
frame:Load("file.csv")
//"TripleGroup" is a hypothetical header for the new column
frame:AddColumn("TripleGroup", "Group * 3")
output frame:ToText()

AddColumnOnLoad(integer index, Libraries.Compute.Statistics.DataFrameColumn column)

This action adds a column that, when the DataFrame is loaded will be used for processing a particular column. This will allow the loader to use customized type information specific to a particular file or situation.

Parameters

  • integer index: the position of the column on loading. For example, an index of 0 means the column at index 0, if one is loaded.
  • Libraries.Compute.Statistics.DataFrameColumn column: the DataFrameColumn to use and enter into the DataFrame.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Columns.NumberColumn

DataFrame frame
NumberColumn column
frame:AddColumnOnLoad(0, column)

frame:Load("Data/Sheet.csv")
output frame:ToText()

AddSelectedCell(Libraries.Containers.Support.Pair<integer> cell)

This adds a cell to the selected range.

Parameters

  • Libraries.Containers.Support.Pair<integer> cell

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Containers.Support.Pair

DataFrame frame
frame:Load("Data.csv")
Pair<integer> cell
cell:Set(0,0)
frame:AddSelectedCell(cell)
output frame:ToText()

AddSelectedCell(integer x, integer y)

This adds a cell to the selected range.

Parameters

  • integer x: the x coordinate of the cell to add
  • integer y: the y coordinate of the cell to add

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedCell(0,0)
output frame:ToText()

AddSelectedColumn(integer index)

This adds a column to the selected range.

Parameters

  • integer index: the column index of the column to add

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
output frame:ToText()

AddSelectedColumnRange(integer start, integer finish)

This adds columns to the selected range, starting from start and ending at finish, inclusive. This means that calculations will be conducted across this entire range.

Parameters

  • integer start: the start of the range
  • integer finish: the end of the range

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0, 2)
output frame:ToText()
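
As a further sketch of the inclusive range, the example below selects columns 0 through 2 and then prints the size of the selection. It assumes Data.csv has at least three columns. If the range is inclusive as described, the reported size should be 3.

```quorum
use Libraries.Compute.Statistics.DataFrame

DataFrame frame
frame:Load("Data.csv")

//Select columns 0, 1, and 2 (both endpoints are included)
frame:AddSelectedColumnRange(0, 2)

//Report how many columns are now in the selection
output frame:GetSelectedColumnSize()
```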

AddSelectedColumns(text headers)

This action reads a comma separated list of header names and determines the indices from this list. This action is inherently strict, where if the parsing fails, the headers are not unique, or there are other issues in the list, this action throws an error.

Parameters

  • text headers: the columns to select

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumns("name1,name2")
output frame:GetSelectedColumnSize()
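
Because this action is strict, a program may want to guard the call. The sketch below uses Quorum's general check/detect error handling, which is not specific to DataFrame. It assumes that repeating a header name counts as a non-unique list, and it relies on GetErrorMessage from Quorum's Error class, which is not documented on this page.

```quorum
use Libraries.Compute.Statistics.DataFrame

DataFrame frame
frame:Load("Data.csv")

check
    //"name1" appears twice, so the list of headers is not unique
    frame:AddSelectedColumns("name1,name1")
    output frame:GetSelectedColumnSize()
detect problem
    //Reached only if AddSelectedColumns throws an error
    output "Selection failed: " + problem:GetErrorMessage()
end
```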

AddSelectedFactor(integer index)

This adds the factor at a particular index to the selection.

Parameters

  • integer index: the index of the factor to add

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedFactor(0)
output frame:ToText()

AddSelectedFactorRange(integer start, integer finish)

This adds factors to the selected range, starting from start and ending at finish, inclusive. This means that calculations will be conducted across this entire range.

Parameters

  • integer start: the start of the range
  • integer finish: the end of the range

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedFactorRange(0, 2)
output frame:ToText()

AddSelectedFactors(text headers)

This action reads a comma separated list of header names and determines the indices from this list. This action is inherently strict, where if the parsing fails, the headers are not unique, or there are other issues in the list, this action throws an error.

Parameters

  • text headers: the columns to select

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedFactors("name1,name2")
output frame:GetSelectedColumnSize()

AddSelectedRow(integer index)

This adds a row to the selected range.

Parameters

  • integer index: the row index of the row to add

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedRow(0)
output frame:ToText()

BarChart()

This action creates a BarChart from the current column selection in the DataFrame. By default, it uses the first column in the selection as the x-axis and the second column as the y-axis. This can be reversed by changing the selection order.

Return

Libraries.Interface.Controls.Charts.BarChart: a BarChart chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.BarChart

DataFrame frame
frame:Load("Data.csv")
frame:SetSelectedColumnRange(0,1)
BarChart chart = frame:BarChart()
chart:SetTitle("My Awesome Title")
chart:SetXAxisTitle("Time")
chart:Display()
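
To illustrate reversing the axes, the sketch below adds the columns to the selection one at a time, in the opposite order. It assumes that the selection keeps the order in which AddSelectedColumn is called and that Data.csv has at least two columns.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.BarChart

DataFrame frame
frame:Load("Data.csv")

//Select column 1 first so it is used for the x-axis,
//then column 0 so it is used for the y-axis
frame:AddSelectedColumn(1)
frame:AddSelectedColumn(0)

BarChart chart = frame:BarChart()
chart:Display()
```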

BarChartByColumn()

This action creates a BarChart from the sum of values of selected columns grouped by the selected factors.

Return

Libraries.Interface.Controls.Charts.BarChart: a BarChart chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.BarChart

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedFactor(3)

BarChart chart = frame:BarChartByColumn()
chart:Display()

BarChartByColumnMaximum()

This action creates a BarChart from the max of values of selected columns grouped by the selected factors.

Return

Libraries.Interface.Controls.Charts.BarChart: a BarChart chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.BarChart

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedFactor(3)

BarChart chart = frame:BarChartByColumnMaximum()
chart:Display()

BarChartByColumnMean()

This action creates a BarChart from the mean of values of selected columns grouped by the selected factors.

Return

Libraries.Interface.Controls.Charts.BarChart: a BarChart chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.BarChart

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedFactor(3)

BarChart chart = frame:BarChartByColumnMean()
chart:Display()

BarChartByColumnMinimum()

This action creates a BarChart from the min of values of selected columns grouped by the selected factors.

Return

Libraries.Interface.Controls.Charts.BarChart: a BarChart chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.BarChart

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedFactor(3)

BarChart chart = frame:BarChartByColumnMinimum()
chart:Display()

BarChartByColumnSum()

This action creates a BarChart from the sum of values of selected columns grouped by the selected factors.

Return

Libraries.Interface.Controls.Charts.BarChart: a BarChart chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.BarChart

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedFactor(3)

BarChart chart = frame:BarChartByColumnSum()
chart:Display()

BoxPlot()

This action creates a BoxPlot from the current column selection in the DataFrame. By default, it uses all columns in the selection as separate values in the chart area. Multiple columns will result in multiple plots of different colors labeled along the x-axis. If a factor is given, the plots will be grouped by that factor.

Return

Libraries.Interface.Controls.Charts.BoxPlot: a chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.BoxPlot

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(1)
BoxPlot chart = frame:BoxPlot()
chart:SetTitle("My Awesome Title")
chart:SetXAxisTitle("Time")
chart:Display()
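
The grouping behavior can be sketched by adding a factor before creating the plot. The example assumes column 0 of Data.csv holds a categorical grouping variable and column 1 holds the numeric values; both indices are placeholders.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.BoxPlot

DataFrame frame
frame:Load("Data.csv")

//The numeric column to plot
frame:AddSelectedColumn(1)

//The factor used to group the plots
frame:AddSelectedFactor(0)

BoxPlot chart = frame:BoxPlot()
chart:Display()
```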

BoxPlotByColumn()

This action creates a BoxPlot from the values of the selected columns grouped by the selected factors.

Return

Libraries.Interface.Controls.Charts.BoxPlot: a BoxPlot chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.BoxPlot

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedFactor(3)

BoxPlot chart = frame:BoxPlotByColumn()
chart:Display()

Calculate(Libraries.Compute.Statistics.DataFrameCalculation calculation)

This action runs a calculation on the data frame. Calculations are not intended to be destructive to the original data.

Parameters

  • Libraries.Compute.Statistics.DataFrameCalculation calculation

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.System.File

//Load a comma separated file
DataFrame frame
File file
file:SetPath("Data.csv")
frame:Load(file)

CalculateColumn(text source)

This action takes a Quorum expression text value and then creates a new column without adding it to the DataFrame. The expression follows the normal rules for Quorum, using the DataFrame's columns as the allowable variables. For example, if a DataFrame has a column, Group, and is an integer, then a value like Group * 2 would take the value of Group, multiply it by 2, and then do that for each row. If a row is an invalid type, an undefined value is placed at that position. The CalculateColumn(text) call is not destructive, meaning it does not change the original frame.

Parameters

  • text source

Return

Libraries.Compute.Statistics.DataFrameColumn:

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.DataFrameColumn
DataFrame frame
frame:Load("file.csv")
DataFrameColumn col = frame:CalculateColumn("Group * 3")
output col:ToText()
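
The same idea can be sketched without a file by building the frame in memory. The header "Group" and the values below are illustrative only, and the example assumes that a column's header is the name the expression uses as its variable.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.DataFrameColumn
use Libraries.Compute.Statistics.Columns.NumberColumn

//Build a one-column frame in memory
NumberColumn group
group:SetHeader("Group")
group:Add(1)
group:Add(2)
group:Add(3)

DataFrame frame
frame:AddColumn(group)

//Compute a new column from the expression without adding it to the frame
DataFrameColumn doubled = frame:CalculateColumn("Group * 2")
output doubled:ToText()

//The frame itself should still contain only the original column
output frame:ToText()
```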

CalculateMaximumRows()

This action calculates the total number of rows in the data frame. To do this, it traverses the columns, finds the column with the max row count, and returns that integer.

Return

integer: the row count of the column with the largest number of rows

Example

use Libraries.Compute.Statistics.DataFrame
DataFrame frame
frame:Load("file.csv")
output frame:CalculateMaximumRows()
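
The difference between columns of unequal length can be sketched with a frame built in memory. The headers and values are illustrative, and the example assumes a DataFrame accepts columns of different sizes. If so, the description above suggests the result is the size of the longer column.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Columns.NumberColumn

//A column with two rows
NumberColumn shorter
shorter:SetHeader("Shorter")
shorter:Add(1)
shorter:Add(2)

//A column with three rows
NumberColumn longer
longer:SetHeader("Longer")
longer:Add(10)
longer:Add(20)
longer:Add(30)

DataFrame frame
frame:AddColumn(shorter)
frame:AddColumn(longer)

//Reports the row count of the column with the most rows
output frame:CalculateMaximumRows()
```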

CheckReducibility()

This action checks that at least some of the variables are significantly correlated, a prerequisite for factor analysis. The CheckReducibility object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. With 2 or more columns selected, this action runs Bartlett’s Test of Sphericity.

Return

Libraries.Compute.Statistics.Tests.CheckReducibility: an object representing the test

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CheckReducibility
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,4)
CheckReducibility test = frame:CheckReducibility()
output test:GetFormalSummary()

CheckReducibilityStrength()

This action measures sampling adequacy for each variable in the model and for the complete model, a prerequisite for factor analysis to work. The CheckReducibilityStrength object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. With 2 or more columns selected, this action runs the Kaiser-Meyer-Olkin Measure Of Sampling Adequacy.

Return

Libraries.Compute.Statistics.Tests.CheckReducibilityStrength: an object representing the test

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CheckReducibilityStrength
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,4)
CheckReducibilityStrength test = frame:CheckReducibilityStrength()
output test:GetFormalSummary()

Compare(Libraries.Language.Object object)

This action compares two object hash codes and returns an integer. The result indicates whether this object's hash code is smaller than, equal to, or larger than that of the object passed as a parameter: -1 means smaller, 0 means equal, and 1 means larger. This action was changed in Quorum 7 to return an integer, instead of a CompareResult object, because the previous implementation was causing efficiency issues.

Parameters

  • Libraries.Language.Object object

Return

integer: the comparison result: -1 (smaller), 0 (equal), or 1 (larger).

Example

Object o
Object t
integer result = o:Compare(t) //1 (larger), 0 (equal), or -1 (smaller)

CompareCounts()

This action uses the selection to conduct a count comparison between one or more columns. The CompareCounts object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected:

  • 1: Chi-Squared Goodness Of Fit vs uniform expected counts
  • 2: Chi-Squared Test Of Independence
  • 3+: Pairwise Chi-Squared Test Of Independence

Return

Libraries.Compute.Statistics.Tests.CompareCounts: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareCounts
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
CompareCounts compare = frame:CompareCounts()
output compare:GetSummary()
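
As a sketch of the single-column case, the example below selects only one column, which per the description above should lead to a goodness of fit test against uniform expected counts. It assumes column 0 of Data.csv holds categorical values.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareCounts

DataFrame frame
frame:Load("Data.csv")

//Only one column is selected, so the goodness of fit case applies
frame:AddSelectedColumn(0)

CompareCounts compare = frame:CompareCounts()
output compare:GetSummary()
```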

CompareDistributionsToNormal()

This action uses the selection to conduct a comparison between a column distribution and a normal distribution. The CompareDistributions object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected: 1 Shapiro-Wilk Normality Test

Return

Libraries.Compute.Statistics.Tests.CompareDistributions: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareDistributions
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
CompareDistributions compare = frame:CompareDistributionsToNormal()
output compare:GetFormalSummary()

CompareMeans(number mean)

This action uses the selection to conduct a comparison between one or two columns and a mean. The CompareMeans object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected: 1 One-Sample T-Test vs given-mean

Parameters

  • number mean

Return

Libraries.Compute.Statistics.Tests.CompareMeans: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareMeans
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
CompareMeans compare = frame:CompareMeans(0)
output compare:GetFormalSummary()

CompareMeans(Libraries.Compute.Statistics.Tests.ExperimentalDesign design)

This action uses an experimental design to pick and conduct the appropriate CompareMeans test.

Parameters

  • Libraries.Compute.Statistics.Tests.ExperimentalDesign design

Return

Libraries.Compute.Statistics.Tests.CompareMeans:

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareMeans
use Libraries.Compute.Statistics.Tests.ExperimentalDesign
    
ExperimentalDesign design
design:AddBetweenSubjectsFactor("Age")
design:AddBetweenSubjectsFactor("Group")
design:AddDependentVariable("Value")

DataFrame frame
frame:Load("Data.csv")
CompareMeans compare = frame:CompareMeans(design)
output compare:GetFormalSummary()

CompareMeans()

This action uses the selection to conduct a comparison between two or more independent columns. The CompareMeans object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected:

  • 1: One-Sample T-Test vs zero-mean
  • 2: Two-Sample T-Test
  • 3+: Anova/Manova

Return

Libraries.Compute.Statistics.Tests.CompareMeans: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareMeans
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
CompareMeans compare = frame:CompareMeans()
output compare:GetFormalSummary()
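
Since the test depends on the number of selected columns, the selection can be used to steer which test runs. The sketch below selects exactly two columns by header, which per the description of CompareMeans() should lead to a Two-Sample T-Test. The headers name1 and name2 are placeholders for numeric columns in Data.csv.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareMeans

DataFrame frame
frame:Load("Data.csv")

//Exactly two columns are selected, so the two-sample case applies
frame:AddSelectedColumns("name1,name2")

CompareMeans compare = frame:CompareMeans()
output compare:GetFormalSummary()
```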

CompareMeansAssumingEqualVariances(boolean equalVariances)

This action uses the selection to conduct a comparison between two or more independent columns. The CompareMeans object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected:

  • 2: Two-Sample T-Test
  • 3+: Anova/Manova

Parameters

  • boolean equalVariances

Return

Libraries.Compute.Statistics.Tests.CompareMeans: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareMeans
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
//Pass true to assume equal variances, or false otherwise
CompareMeans compare = frame:CompareMeansAssumingEqualVariances(true)
output compare:GetFormalSummary()

CompareMeansPairwise()

This action uses the selection to conduct a comparison between groups. The CompareMeansPairwise object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected: 2+ Pairwise Two Sample Test (with corrections)

Return

Libraries.Compute.Statistics.Tests.CompareMeansPairwise: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareMeansPairwise
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,4)
CompareMeansPairwise compare = frame:CompareMeansPairwise()
output compare:GetFormalSummary()

CompareMeansPairwiseUsingLenientCorrection()

This action uses the selection to conduct a liberal comparison between groups. The CompareMeansPairwise object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected: 2+ Pairwise Two Sample Test (with corrections)

Return

Libraries.Compute.Statistics.Tests.CompareMeansPairwise: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareMeansPairwise
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,4)
CompareMeansPairwise compare = frame:CompareMeansPairwiseUsingLenientCorrection()
output compare:GetFormalSummary()

CompareMeansPairwiseUsingStrictCorrection()

This action uses the selection to conduct a conservative comparison between groups. The CompareMeansPairwise object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected: 2+ Pairwise Two Sample Test (with corrections)

Return

Libraries.Compute.Statistics.Tests.CompareMeansPairwise: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareMeansPairwise
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,4)
CompareMeansPairwise compare = frame:CompareMeansPairwiseUsingStrictCorrection()
output compare:GetFormalSummary()

CompareRankedMeans(number median)

This action uses the selection to conduct a comparison between a ranked column and a median. The CompareMeans object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected: 1 One-Sample Wilcoxon Signed-Ranks Test vs given-median

Parameters

  • number median

Return

Libraries.Compute.Statistics.Tests.CompareMeans: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareMeans
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
CompareMeans compare = frame:CompareRankedMeans(0)
output compare:GetFormalSummary()

CompareRankedMeans()

This action uses the selection to conduct a comparison between two or more ranked columns. The CompareMeans object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected:

  • 1: One-Sample Wilcoxon Signed-Ranks Test vs zero-median
  • 2: Mann-Whitney U-Test (Wilcoxon Rank-Sum Test)
  • 3+: Kruskal-Wallis H Test

Return

Libraries.Compute.Statistics.Tests.CompareMeans: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareMeans
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
CompareMeans compare = frame:CompareRankedMeans()
output compare:GetFormalSummary()

CompareRankedMeansPairwise()

This action uses the selection to conduct a comparison between groups when they are not assumed to be normally distributed. The CompareMeansPairwise object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected: 2+ Pairwise Ranked Two Sample Test (with corrections)

Return

Libraries.Compute.Statistics.Tests.CompareMeansPairwise: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareMeansPairwise
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,4)
CompareMeansPairwise compare = frame:CompareRankedMeansPairwise()
output compare:GetFormalSummary()

CompareRelatedMeans()

This action uses the selection to conduct a comparison between two or more related columns. The CompareMeans object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected:

  • 1: One-Sample T-Test vs zero-mean
  • 2: Paired T-Test vs zero-mean
  • 3+: Repeated Measures Anova

Return

Libraries.Compute.Statistics.Tests.CompareMeans: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareMeans
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
CompareMeans compare = frame:CompareRelatedMeans()
output compare:GetFormalSummary()

CompareRelatedMeans(number mean)

This action uses the selection to conduct a comparison between one or two columns and a mean. The CompareMeans object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected: 1 Paired T-Test vs given-mean

Parameters

  • number mean

Return

Libraries.Compute.Statistics.Tests.CompareMeans: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareMeans
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedColumn(1)
CompareMeans compare = frame:CompareRelatedMeans(0)
output compare:GetFormalSummary()

CompareRelatedMeansPairwise()

This action uses the selection to conduct a comparison between groups. The CompareMeansPairwise object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected: 2+ Pairwise Paired Two Sample Test (with corrections)

Return

Libraries.Compute.Statistics.Tests.CompareMeansPairwise: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareMeansPairwise
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,4)
CompareMeansPairwise compare = frame:CompareRelatedMeansPairwise()
output compare:GetFormalSummary()

CompareRelatedRankedMeans(number median)

This action uses the selection to conduct a comparison between a ranked column and a median. The CompareMeans object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected: 1 Paired Wilcoxon Signed-Ranks Test vs given-median

Parameters

  • number median

Return

Libraries.Compute.Statistics.Tests.CompareMeans: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareMeans
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
CompareMeans compare = frame:CompareRelatedRankedMeans(0)
output compare:GetFormalSummary()

CompareRelatedRankedMeans()

This action uses the selection to conduct a comparison between two or more ranked related columns. The CompareMeans object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected:

  • 1: One-Sample Wilcoxon Signed-Ranks Test vs zero-median
  • 2: Wilcoxon Signed-Ranks Test vs zero-median
  • 3+: Friedman Test

Return

Libraries.Compute.Statistics.Tests.CompareMeans: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareMeans
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
CompareMeans compare = frame:CompareRelatedRankedMeans()
output compare:GetFormalSummary()

CompareRelatedRankedMeansPairwise()

This action uses the selection to conduct a comparison between groups. The CompareMeansPairwise object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected: 2+ Pairwise Paired Ranked Two Sample Test (with corrections)

Return

Libraries.Compute.Statistics.Tests.CompareMeansPairwise: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareMeansPairwise
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,4)
CompareMeansPairwise compare = frame:CompareRelatedRankedMeansPairwise()
output compare:GetFormalSummary()

CompareRelatedVariances()

This action uses the selection to conduct a variance comparison between two or more related columns. The CompareVariances object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected: 2+ Mauchly's Test

Return

Libraries.Compute.Statistics.Tests.CompareVariances: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareVariances
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
CompareVariances compare = frame:CompareRelatedVariances()
output compare:GetFormalSummary()

CompareVariances()

This action uses the selection to conduct a variance comparison between two or more independent columns. This action will use the center as The CompareVariances object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. This action runs a test based on how many columns are selected: 2+ Levene's Test

Return

Libraries.Compute.Statistics.Tests.CompareVariances: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareVariances
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
CompareVariances compare = frame:CompareVariances()
output compare:GetFormalSummary()

CompareVariances(Libraries.Compute.Statistics.Tests.ExperimentalDesign design)

This action uses an experimental design to pick and conduct the appropriate CompareVariances test.

Parameters

  • Libraries.Compute.Statistics.Tests.ExperimentalDesign design

Return

Libraries.Compute.Statistics.Tests.CompareVariances:

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareVariances
use Libraries.Compute.Statistics.Tests.ExperimentalDesign
    
ExperimentalDesign design
design:AddSubjectIdentifier("id")
design:AddWithinSubjectsFactor("Time")
design:AddBetweenSubjectsFactor("Group")
design:AddDependentVariable("Value")

DataFrame frame
frame:Load("Data.csv")
CompareVariances compare = frame:CompareVariances(design)
output compare:GetFormalSummary()

CompareVariancesUsingMean()

This action uses the selection to conduct a variance comparison between two or more independent columns. As the name indicates, the comparison is centered on the mean. The CompareVariances object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. The test that runs depends on how many columns are selected. With 2 or more columns, it runs Levene's Test.

Return

Libraries.Compute.Statistics.Tests.CompareVariances: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareVariances
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
CompareVariances compare = frame:CompareVariancesUsingMean()
output compare:GetFormalSummary()

ConvertToMatrix()

This action takes a DataFrame and converts it into a matrix with number values. If the data frame contains columns that cannot be converted to numbers, this action throws an exception. Finally, all columns must have the same size for this conversion to work.

Return

Libraries.Compute.Matrix: A matrix of real number values

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Matrix

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv")

//convert the data frame into a matrix of number values
Matrix matrix = frame:ConvertToMatrix()
output matrix:ToText()
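
The two requirements above (numeric columns, all of the same size) can be sketched with a frame built in memory instead of loaded from a file. This is only an illustration: the headers and values are made up, and it assumes NumberColumn and the single-argument AddColumn behave as shown in the AddColumn examples of this documentation.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Columns.NumberColumn
use Libraries.Compute.Matrix

//Two numeric columns of the same size (three rows each); values are made up
NumberColumn first
first:SetHeader("First")
first:Add(1)
first:Add(2)
first:Add(3)

NumberColumn second
second:SetHeader("Second")
second:Add(4)
second:Add(5)
second:Add(6)

DataFrame frame
frame:AddColumn(first)
frame:AddColumn(second)

//Every column holds numbers and has the same size, so the conversion should not throw
Matrix matrix = frame:ConvertToMatrix()
output matrix:ToText()
```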

Copy(text source)

This action takes a Quorum expression text value and then filters the DataFrame on a copy. The expression follows the normal rules for Quorum, using the DataFrame's columns as the allowable variables. For example, if a DataFrame has a column, Group, and is an integer, then a value like Group = 2 would look for any value in the column that has the integer 2 and retain only those rows. The filter call is on a copy, meaning it does not change the original DataFrame. If we instead want to filter destructively, we should call Filter instead.

Parameters

  • text source

Return

Libraries.Compute.Statistics.DataFrame:

Example

use Libraries.Compute.Statistics.DataFrame
DataFrame frame
frame:Load("file.csv")
DataFrame newFrame = frame:Copy("Group > 3")
output newFrame:ToText()
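
The difference between filtering on a copy and filtering destructively can be sketched as follows. The frame is built in memory with a made-up "Group" column, assuming NumberColumn and the single-argument AddColumn behave as in the AddColumn examples of this documentation.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Columns.NumberColumn

//Build a one-column frame in memory; the header and values are illustrative
NumberColumn groupColumn
groupColumn:SetHeader("Group")
groupColumn:Add(1)
groupColumn:Add(2)
groupColumn:Add(3)
groupColumn:Add(4)
groupColumn:Add(5)

DataFrame frame
frame:AddColumn(groupColumn)

//Copy filters a copy, so frame should still hold every row afterwards
DataFrame copied = frame:Copy("Group > 3")
output copied:ToText()
output frame:ToText()

//Filter is destructive, so frame itself is reduced by this call
frame:Filter("Group > 3")
output frame:ToText()
```

Using Copy is the safer choice when the original data is still needed later in the program.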

Copy()

This action returns a copy of the data frame, which deep copies every column.

Return

Libraries.Compute.Statistics.DataFrame: A deep copy of the entire data frame

Example

use Libraries.Compute.Statistics.DataFrame

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv")

//return a copy of the entire DataFrame
DataFrame copy = frame:Copy()

Copy(integer columnStart, integer columnEnd, integer rowStart, integer rowEnd)

Returns a copy of the data frame that contains only the requested range of columns and rows.

Parameters

  • integer columnStart: The 0-indexed first column
  • integer columnEnd: The 0-indexed last column
  • integer rowStart: The 0-indexed first row
  • integer rowEnd: The 0-indexed last row

Return

Libraries.Compute.Statistics.DataFrame: A copy of the data frame with constrained columns and rows

Example

use Libraries.Compute.Statistics.DataFrame

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv")

//return a copy with a max of the first five rows
DataFrame copy = frame:Copy(0, frame:GetSize(), 0, 5)

CopySelectedColumns()

Returns a copy of the data frame that contains only the selected columns.

Return

Libraries.Compute.Statistics.DataFrame: A copy of the data frame with only selected columns

Example

use Libraries.Compute.Statistics.DataFrame

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedColumn(1)

//return a copy that contains only the two selected columns
DataFrame copy = frame:CopySelectedColumns()

CorrelateSelectedColumns()

This action uses the selection to conduct a correlation between two or more independent columns. The CorrelateGroups object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. The test that runs depends on how many columns are selected. With 2 columns, it runs a Pearson Correlation. With 3 or more columns, it runs a Pairwise Pearson Correlation.

Return

Libraries.Compute.Statistics.Tests.CorrelateGroups: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CorrelateGroups
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
CorrelateGroups correlate = frame:CorrelateSelectedColumns()
output correlate:GetFormalSummary()

CorrelateSelectedRankedColumns()

This action uses the selection to conduct a correlation between two or more independent ranked columns. The CorrelateGroups object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. The test that runs depends on how many columns are selected. With 2 columns, it runs a Spearman Correlation. With 3 or more columns, it runs a Pairwise Spearman Correlation.

Return

Libraries.Compute.Statistics.Tests.CorrelateGroups: an object representing the correlation

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CorrelateGroups
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
CorrelateGroups correlate = frame:CorrelateSelectedRankedColumns()
output correlate:GetFormalSummary()

CorrelationMatrix()

The CorrelationMatrixTransform class takes a DataFrame and its selection to decide how to transform it. Specifically, CorrelationMatrixTransform takes every pair of columns in the selection, computes the Pearson correlation for that pair, and places it into a matrix. The diagonal of that matrix will contain 1.0 for each column.

Return

Libraries.Compute.Matrix:

Example


    use Libraries.Compute.Statistics.DataFrame
    use Libraries.Compute.Matrix
  
    DataFrame frame
    frame:Load("Data.csv")
    frame:AddSelectedColumn(0)
    frame:AddSelectedColumn(1)
    frame:AddSelectedColumn(2)
    frame:AddSelectedColumn(3)
    
    Matrix corMatrix = frame:CorrelationMatrix()
    output corMatrix:ToText()
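
As a self-contained sketch, the same call can be tried on a small frame built in memory. The two columns and their values are made up for illustration, and the construction assumes NumberColumn and the single-argument AddColumn behave as in the AddColumn examples of this documentation. According to the description above, the diagonal of the result should contain 1.0.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Columns.NumberColumn
use Libraries.Compute.Matrix

//Two made-up numeric columns with four rows each
NumberColumn first
first:SetHeader("First")
first:Add(1)
first:Add(2)
first:Add(3)
first:Add(4)

NumberColumn second
second:SetHeader("Second")
second:Add(2)
second:Add(4)
second:Add(6)
second:Add(9)

DataFrame frame
frame:AddColumn(first)
frame:AddColumn(second)

//Select both columns so that each pair is correlated
frame:AddSelectedColumn(0)
frame:AddSelectedColumn(1)

Matrix corMatrix = frame:CorrelationMatrix()
output corMatrix:ToText()
```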

CovarianceMatrix()

The CovarianceMatrixTransform class takes a DataFrame and its selection to decide how to transform it. Specifically, CovarianceMatrixTransform takes every pair of columns in the selection, computes the covariance for that pair, and places it into a matrix. The diagonal of that matrix will contain the variance for each column.

Return

Libraries.Compute.Matrix:

Example


    use Libraries.Compute.Statistics.DataFrame
    use Libraries.Compute.Matrix
  
    DataFrame frame
    frame:Load("Data.csv")
    frame:AddSelectedColumn(0)
    frame:AddSelectedColumn(1)
    frame:AddSelectedColumn(2)
    frame:AddSelectedColumn(3)
    
    Matrix covMatrix = frame:CovarianceMatrix()
    output covMatrix:ToText()

CreateChart(Libraries.Compute.Statistics.DataFrameChartCreator creator, boolean sort)

This action creates a chart, given a particular DataFrameChartCreator instance and returns a chart from it for this particular data.

Parameters

  • Libraries.Compute.Statistics.DataFrameChartCreator creator
  • boolean sort

Return

Libraries.Interface.Controls.Charts.Chart: A chart object, which can be embedded into a user interface

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Charts.BarChartCreator
use Libraries.System.File

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv")

//We might instantiate an object to create a bar chart, setting some properties if we want to
BarChartCreator create
//The second argument is the sort flag from this action's signature
frame:CreateChart(create, true)

CreateChart(Libraries.Compute.Statistics.DataFrameChartCreator creator)

This action creates a chart, given a particular DataFrameChartCreator instance and returns a chart from it for this particular data.

Parameters

  • Libraries.Compute.Statistics.DataFrameChartCreator creator

Return

Libraries.Interface.Controls.Charts.Chart: A chart object, which can be embedded into a user interface

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Charts.BarChartCreator
use Libraries.System.File

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv")

//We might instantiate an object to create a bar chart, setting some properties if we want to
BarChartCreator create
frame:CreateChart(create)

CreateNewDataFrameFromFactoredColumns()

Helper action for ByColumn() style charts, used to create new columns after grouping values by factor

Return

Libraries.Compute.Statistics.DataFrame:

CrossTab()

The CrossTab class takes a DataFrame and its selection to decide how to transform it. Specifically, CrossTab takes the first column in the selection and places it on the left-most column of the new frame, then takes the second column and places it on top. In both cases, the columns are filtered for unique items and sorted. Once these columns are placed, the CrossTab calculates how many of each unique item pair exist in the original DataFrame. For example, if a row of the original DataFrame has the value 'test' in the first selected column and '11.2' in the second, then the cell for that pair in the transformed CrossTab DataFrame is incremented. Thus, if there were no other rows with the pair test, 11.2, this value in the new DataFrame would be 1.

Return

Libraries.Compute.Statistics.DataFrame:

Example


    use Libraries.Compute.Statistics.DataFrame
    
    //Create a DataFrame, load it, and set which column to focus on
    DataFrame frame
    frame:Load("Words.csv")
    frame:AddSelectedColumn(2)
    frame:AddSelectedColumn(3)
    
    DataFrame crossTab = frame:CrossTab()
    crossTab:Save("Cross.csv")
    output "File Saved."

DonutChart()

This action creates a PieChart with a donut hole of 0.5 (50%) from the current selection in the DataFrame. By default, it uses all columns as separate values in the selection as the chart area.

Return

Libraries.Interface.Controls.Charts.PieChart: a chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.PieChart

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(1)
frame:AddSelectedColumn(2)
PieChart chart = frame:DonutChart()
chart:SetTitle("My Awesome Title")
chart:Display()

DonutChart(number donutHolePercent)

This action creates a PieChart with a donut hole of variable size from the current selection in the DataFrame. By default, it uses all columns as separate values in the selection as the chart area.

Parameters

  • number donutHolePercent

Return

Libraries.Interface.Controls.Charts.PieChart: a chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.PieChart

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(1)
frame:AddSelectedColumn(2)
PieChart chart = frame:DonutChart(0.25)
chart:SetTitle("My Awesome Title")
chart:Display()

DonutChartByColumn(number donutHolePercent)

This action creates a PieChart from the sum of values of selected columns grouped by the selected factors.

Parameters

  • number donutHolePercent

Return

Libraries.Interface.Controls.Charts.PieChart: a PieChart chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.PieChart

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedFactor(3)

PieChart chart = frame:DonutChartByColumn(0.25)
chart:Display()

DonutChartByColumn()

This action creates a PieChart from the sum of values of selected columns grouped by the selected factors.

Return

Libraries.Interface.Controls.Charts.PieChart: a PieChart chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.PieChart

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedFactor(3)

PieChart chart = frame:DonutChartByColumn()
chart:Display()

EmptyColumnsOnLoad()

This action empties the columns loaded and frees up memory from the initialization. After any load operation, this action is automatically called.

EmptySelectedCells()

This action empties the selection, so that no cells are selected.

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedCell(0,0)
frame:EmptySelectedCells()
output frame:ToText()

EmptySelectedColumns()

This action empties the selection, so that no columns are selected.

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:EmptySelectedColumns()
output frame:ToText()

EmptySelectedFactors()

This action empties the selection, so that no factors are selected.

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedFactor(0)
frame:EmptySelectedFactors()
output frame:ToText()

EmptySelectedRows()

This action empties the selection, so that no rows are selected.

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedRow(0)
frame:EmptySelectedRows()
output frame:ToText()

Equals(Libraries.Language.Object object)

This action determines if two objects are equal based on their hash code values.

Parameters

  • Libraries.Language.Object object

Return

boolean: True if the hash codes are equal and false if they are not equal.

Example

use Libraries.Language.Object
use Libraries.Language.Types.Text
Object o
Text t
boolean result = o:Equals(t)

Filter(text source)

This action takes a Quorum expression text value and then filters the DataFrame. The expression follows the normal rules for Quorum, using the DataFrame's columns as the allowable variables. For example, if a DataFrame has a column, Group, and is an integer, then a value like Group = 2 would look for any value in the column that has the integer 2 and retain only those rows. The filter call is destructive, meaning it changes the DataFrame itself. If we instead want to obtain a new DataFrame with only the non-filtered data, but to retain this one, the same expression can be used in the Copy action.

Parameters

  • text source

Example

use Libraries.Compute.Statistics.DataFrame
DataFrame frame
frame:Load("file.csv")
frame:Filter("Group > 3")
output frame:ToText()

GeoMap(text mapBoundaryFilePath)

This action creates a GeoMap from the current selection in the DataFrame using a custom map. By default, it uses all columns as separate values in the selection as the chart area.

Parameters

  • text mapBoundaryFilePath

Return

Libraries.Interface.Controls.Charts.GeoMap: a chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.GeoMap

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(1)
frame:AddSelectedColumn(2)
GeoMap chart = frame:GeoMap("filePathToBoundaryData")
chart:SetTitle("My Awesome Title")
chart:Display()

GeoMap()

This action creates a GeoMap from the current selection in the DataFrame. By default, it uses all columns as separate values in the selection as the chart area.

Return

Libraries.Interface.Controls.Charts.GeoMap: a chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.GeoMap

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(1)
frame:AddSelectedColumn(2)
GeoMap chart = frame:GeoMap()
chart:SetTitle("My Awesome Title")
chart:Display()

GetColumn(text header)

This action returns the first column with the name "header" in its header row. If multiple columns have the same name, then to get them all, you will need to iterate and find each one using GetColumn(integer) instead.

Parameters

  • text header: The column we want back.

Return

Libraries.Compute.Statistics.DataFrameColumn: the column, possibly undefined if no column of that name exists

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.DataFrameColumn

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv") 
DataFrameColumn column = frame:GetColumn("Gender")
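
When several columns share a name, the iteration described above can be sketched as a loop over the column indices. This is a hedged illustration: it assumes DataFrameColumn offers a GetHeader action matching the SetHeader action used elsewhere in this documentation, and "Gender" is only a sample header.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.DataFrameColumn

DataFrame frame
frame:Load("Data.csv")

//Visit every column by index and report the ones whose header matches
text wanted = "Gender"
integer i = 0
repeat while i < frame:GetSize()
    DataFrameColumn column = frame:GetColumn(i)
    //GetHeader is assumed here as the counterpart of SetHeader
    if column:GetHeader() = wanted
        output "Match at column " + i
    end
    i = i + 1
end
```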

GetColumn(integer index)

This action obtains a column from the DataFrame. This column is the original, not a copy, so modifications made to the column will be permanent. If the goal is to obtain a copy, then either the Copy actions or the Transform classes should be used.

Parameters

  • integer index: The column we want back.

Return

Libraries.Compute.Statistics.DataFrameColumn:

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.DataFrameColumn

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv") 
DataFrameColumn column = frame:GetColumn(0)

GetColumnOnLoad(integer index)

This action returns the column used for loading at a particular index, if one exists.

Parameters

  • integer index

Return

Libraries.Compute.Statistics.DataFrameColumn:

GetColumns()

This action gets the columns in the DataFrame. This allows direct control of the columns for this particular data frame. We suggest not using these values directly unless required by an application.

Return

Libraries.Containers.Array:

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.DataFrameColumn
use Libraries.Containers.Array

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv") 
Array<DataFrameColumn> col = frame:GetColumns()

GetHashCode()

This action gets the hash code for an object.

Return

integer: The integer hash code of the object.

Example

use Libraries.Language.Object
Object o
integer hash = o:GetHashCode()

GetHeaders()

This action gets the headers in the DataFrame. This allows the user to get all of the names at once as copies.

Return

Libraries.Containers.Array:

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Containers.Array

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv") 
Array<text> col = frame:GetHeaders()

GetListeners()

This action returns an iterator of the listeners on the DataFrame's selection.

Return

Libraries.Containers.Iterator:

GetSelectedColumnSize()

This action obtains how many columns are selected in the selection.

Return

integer:

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
output frame:GetSelectedColumnSize()

GetSelectedFactorSize()

This action obtains how many factors are selected in the selection.

Return

integer:

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedFactor(0)
output frame:GetSelectedFactorSize()

GetSelection()

This action gets the selection in the DataFrame.

Return

Libraries.Compute.Statistics.DataFrameSelection: the selection of the frame

GetSize()

This action returns the number of columns in the data frame. This value is not related to the number of rows in any particular column.

Return

integer: the number of columns in the DataFrame.

Example

use Libraries.Compute.Statistics.DataFrame

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv") 
output frame:GetSize()
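
To make the distinction between columns and rows concrete, here is a small sketch with a frame built in memory. The headers and values are made up, and the construction assumes NumberColumn and the single-argument AddColumn behave as in the AddColumn examples of this documentation. The frame holds two columns of three rows each, so GetSize should report the column count rather than the row count.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Columns.NumberColumn

//Two made-up columns, each with three rows
NumberColumn firstColumn
firstColumn:SetHeader("First")
firstColumn:Add(1)
firstColumn:Add(2)
firstColumn:Add(3)

NumberColumn secondColumn
secondColumn:SetHeader("Second")
secondColumn:Add(4)
secondColumn:Add(5)
secondColumn:Add(6)

DataFrame frame
frame:AddColumn(firstColumn)
frame:AddColumn(secondColumn)

//GetSize counts columns, not the rows inside them
output frame:GetSize()
//IsEmpty is based on the same column count
output frame:IsEmpty()
```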

HasColumn(text header)

This action returns whether or not there is a column named "header" in its header row. If multiple columns have the same name, then this action returns true.

Parameters

  • text header: The column we want back.

Return

boolean: A boolean of true if a column of this name exists

Example

use Libraries.Compute.Statistics.DataFrame

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv") 
output frame:HasColumn("Gender")
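
Because GetColumn(text) may return undefined when no column has the requested name, HasColumn can serve as a guard before the lookup. The following is a sketch under assumptions: "Gender" is only a sample header, and GetHeader is assumed to exist on DataFrameColumn as the counterpart of SetHeader.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.DataFrameColumn

DataFrame frame
frame:Load("Data.csv")

//Only ask for the column once we know it exists
if frame:HasColumn("Gender")
    DataFrameColumn column = frame:GetColumn("Gender")
    //GetHeader is assumed here as the counterpart of SetHeader
    output column:GetHeader()
else
    output "No column named Gender"
end
```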

Histogram(integer binWidth)

This action creates a Histogram from the current column selection in the DataFrame. By default, Histogram() uses the interquartile range to calculate its bin width, whereas in this version, this is overridden and the given integer bin width is used instead.

Parameters

  • integer binWidth

Return

Libraries.Interface.Controls.Charts.Histogram: a Histogram chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.Histogram

DataFrame frame
frame:Load("data.csv")    
frame:AddSelectedColumn(0)

Histogram chart = frame:Histogram(5)
chart:Display()

Histogram(number binWidth)

This action creates a Histogram from the current column selection in the DataFrame. By default, Histogram() uses the interquartile range to calculate its bin width, whereas in this version, this is overridden and the given number bin width is used instead.

Parameters

  • number binWidth

Return

Libraries.Interface.Controls.Charts.Histogram: a Histogram chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.Histogram

DataFrame frame
frame:Load("data.csv")    
frame:AddSelectedColumn(0)

Histogram chart = frame:Histogram(2.5)
chart:Display()

Histogram()

This action creates a Histogram from the current column selection in the DataFrame. By default, it uses the interquartile range to calculate its bin width. This can be overridden by calling Histogram(integer) or Histogram(number) and passing the bin width directly.

Return

Libraries.Interface.Controls.Charts.Histogram: a Histogram chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.Histogram

DataFrame frame
frame:Load("data.csv")    
frame:AddSelectedColumn(0)

Histogram chart = frame:Histogram()
chart:Display()

HistogramByColumn()

This action creates a Histogram from the values of selected columns grouped by the selected factors.

Return

Libraries.Interface.Controls.Charts.Histogram: a Histogram chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.Histogram

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedFactor(3)

Histogram chart = frame:HistogramByColumn()
chart:Display()

HistogramByColumn(number binWidth)

This action creates a Histogram from the values of selected columns grouped by the selected factors, using the given bin width.

Parameters

  • number binWidth

Return

Libraries.Interface.Controls.Charts.Histogram: a Histogram chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.Histogram

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedFactor(3)

Histogram chart = frame:HistogramByColumn(5.0)
chart:Display()

InterQuartileRange()

This action calculates the InterQuartileRange of the selected column.

Return

Libraries.Compute.Statistics.Calculations.InterQuartileRange: the interquartile range

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Calculations.InterQuartileRange
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
InterQuartileRange range = frame:InterQuartileRange()

InterQuartileRangeSelectedColumns()

This action calculates the interquartile range of the selected columns. In this case, the full calculation objects are returned.

Return

Libraries.Containers.Array: an array of the calculations

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Calculations.InterQuartileRange
use Libraries.Containers.Array
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
Array<InterQuartileRange> values = frame:InterQuartileRangeSelectedColumns()

IsEmpty()

This action returns true if the number of columns is zero

Return

boolean: true if the number of columns is zero.

Example

use Libraries.Compute.Statistics.DataFrame

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv") 
output frame:IsEmpty()

Kurtosis()

This action calculates the kurtosis of the selected column.

Return

number: the kurtosis

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
output frame:Kurtosis()

KurtosisSelectedColumns()

This action calculates the kurtosis of the selected columns. In this case, the full calculation objects are returned.

Return

Libraries.Containers.Array: an array of the calculations

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Calculations.Kurtosis
use Libraries.Containers.Array
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
Array<Kurtosis> values = frame:KurtosisSelectedColumns()

LineChart()

This action creates a LineChart from the current column selection in the DataFrame. By default, it uses all columns as separate lines in the selection as the chart area. Only one factor is allowed and this factor controls the y-axis for all of the lines. The scale can be set manually by using the LineChart

Return

Libraries.Interface.Controls.Charts.LineChart: a LineChart chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.LineChart

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(1)
frame:AddSelectedFactor(0)
LineChart chart = frame:LineChart()
chart:SetTitle("My Awesome Title")
chart:SetXAxisTitle("Time")
chart:Display()

LineChartByColumn()

This action creates a LineChart from the values of selected columns grouped by the selected factors.

Return

Libraries.Interface.Controls.Charts.LineChart: a LineChart chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.LineChart

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedFactor(3)

LineChart chart = frame:LineChartByColumn()
chart:Display()

Load(Libraries.Compute.Statistics.DataFrameLoader loader)

This action loads data using a particular loader, which may obtain the data from memory or in any other way, not necessarily from a file on disk.

Parameters

  • Libraries.Compute.Statistics.DataFrameLoader loader

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Loaders.WebDataFrameLoader

DataFrame frame
WebDataFrameLoader loader
loader:SetLocation("SomeLocation")
frame:Load(loader)

Load(Libraries.System.File file, Libraries.Compute.Statistics.DataFrameLoader loader)

This action loads data from a file and then places it into the existing data frame. If data already exists in this data frame, it is discarded and replaced.

Parameters

  • Libraries.System.File file
  • Libraries.Compute.Statistics.DataFrameLoader loader

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Loaders.CommaSeparatedLoader
use Libraries.System.File

//Load a comma separated file
DataFrame frame
File file
file:SetPath("Data.csv")
CommaSeparatedLoader loader
frame:Load(file, loader)

Load(text location)

This action loads a data frame from a file relative to the working directory, which is typically where the executable lives.

Parameters

  • text location: The file to load, parsed as text relative to the working directory. This is usually the directory of the executable.

Example

use Libraries.Compute.Statistics.DataFrame

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv")

Load(Libraries.System.File file)

This action loads a data frame from a file.

Parameters

  • Libraries.System.File file

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.System.File

//Load a comma separated file
DataFrame frame
File file
file:SetPath("Data.csv")
frame:Load(file)

LoadFromCommaSeparatedValue(text value)

This action loads a data frame from text formatted as a Comma Separated Value (CSV).

Parameters

  • text value

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.System.File

//Load a comma separated text value
DataFrame frame
text value = "Hello, how, are, you
    1, 2, 3, 4
    2, 3, 4, 5"
frame:LoadFromCommaSeparatedValue(value)

Mean()

This action calculates the mean of the selected column.

Return

number: the mean

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
output frame:Mean()

MeanSelectedColumns()

This action calculates the mean of the selected columns. In this case, the full calculation objects are returned.

Return

Libraries.Containers.Array: an array of the calculations

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Calculations.Mean
use Libraries.Containers.Array
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
Array<Mean> values = frame:MeanSelectedColumns()

Median()

This action calculates the median of the selected column.

Return

number: the median

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
output frame:Median()
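
The following sketch contrasts Mean and Median on one small column built in memory. The header and values are made up, and the construction assumes NumberColumn and the single-argument AddColumn behave as in the AddColumn examples of this documentation. For the values 1, 2, 3, 4, and 100, the arithmetic mean is 22 while the median is 3, which shows how a single large value affects the two measures differently.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Columns.NumberColumn

//One made-up column with a single large value
NumberColumn income
income:SetHeader("Income")
income:Add(1)
income:Add(2)
income:Add(3)
income:Add(4)
income:Add(100)

DataFrame frame
frame:AddColumn(income)
frame:AddSelectedColumn(0)

//The mean is pulled toward the large value, the median is not
output frame:Mean()
output frame:Median()
```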

MedianSelectedColumns()

This action calculates the median of the selected columns. In this case, the full calculation objects are returned.

Return

Libraries.Containers.Array: an array of the calculations

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Calculations.Median
use Libraries.Containers.Array
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
Array<Median> values = frame:MedianSelectedColumns()

NormalityCheckChart()

This action creates what is often called a QQ plot, or quantile-quantile plot, in the academic literature. The broad idea is that it maps the values in a column to the theoretical ranking values in a normal distribution. If the chart is largely linear up and to the right, then the data appears to be reasonably, although informally, normally distributed. This action is a helper that can only conduct this calculation on one column for a normal distribution, although the broad idea could be extended further if needed.

Return

Libraries.Interface.Controls.Charts.ScatterPlot: a ScatterPlot that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.ScatterPlot

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(1)
ScatterPlot chart = frame:NormalityCheckChart()
chart:SetTitle("My Normality Check Chart")
chart:Display()

NormalityCheckChartByColumn()

This action creates a NormalityCheckChart from the values of selected columns grouped by the selected factors.

Return

Libraries.Interface.Controls.Charts.ScatterPlot: a NormalityCheckChart chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.ScatterPlot

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedFactor(3)

ScatterPlot chart = frame:NormalityCheckChartByColumn()
chart:Display()

PieChart()

This action creates a PieChart from the current selection in the DataFrame. By default, it uses all columns as separate values in the selection as the chart area.

Return

Libraries.Interface.Controls.Charts.PieChart: a chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.PieChart

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(1)
frame:AddSelectedColumn(2)
PieChart chart = frame:PieChart()
chart:SetTitle("My Awesome Title")
chart:Display()

PieChartByColumn()

This action creates a PieChart from the sum of values of selected columns grouped by the selected factors.

Return

Libraries.Interface.Controls.Charts.PieChart: a PieChart chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.PieChart

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedFactor(3)

PieChart chart = frame:PieChartByColumn()
chart:Display()

PieChartByColumnSum()

This action creates a PieChart from the sum of values of selected columns grouped by the selected factors.

Return

Libraries.Interface.Controls.Charts.PieChart: a PieChart chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.PieChart

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedFactor(3)

PieChart chart = frame:PieChartByColumnSum()
chart:Display()

PopulationVariance()

This action calculates the population variance of the selected column.

Return

number: the variance

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
output frame:PopulationVariance()
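
The difference between the population variance and the sample variance is only the divisor. As a hedged illustration, written in Python with its standard statistics module rather than in Quorum, the sketch below computes both for a small list of made-up values.

```python
from statistics import pvariance, variance

# Illustrative data only; the mean is 5 and the squared deviations sum to 32.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

population = pvariance(data)  # divides the sum of squares by n
sample = variance(data)       # divides the sum of squares by n - 1

print(population, sample)
```

For these eight values the population variance is 32 / 8 and the sample variance is 32 / 7, so the sample figure is always the larger of the two.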

PopulationVarianceSelectedColumns()

This action calculates the population variance of the selected columns. In this case, the full calculation objects are returned.

Return

Libraries.Containers.Array: an array of the calculations

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
Array<Variance> values = frame:PopulationVarianceSelectedColumns()

PrincipalComponentAnalysis()

Reduces the dimensionality of a model in an attempt to maximize variances while maintaining the same explanatory outcome. The PrincipalComponentAnalysis object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style. The test this action runs depends on how many columns are selected: with two or more, it runs the Kaiser-Meyer-Olkin Measure of Sampling Adequacy.

Return

Libraries.Compute.Statistics.Tests.PrincipalComponentAnalysis: an object representing the test

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.PrincipalComponentAnalysis
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,4)
PrincipalComponentAnalysis pca = frame:PrincipalComponentAnalysis()
output pca:GetFormalSummary()

RegressionOnSelected()

This action uses the selection to conduct a regression. The Regression object returned gives information back in several formats, including text formatted in the American Psychological Association (APA) style.

Return

Libraries.Compute.Statistics.Tests.Regression: an object representing the comparison

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.Regression
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedFactor(1)
frame:AddSelectedFactor(2)
frame:AddSelectedFactor(3)
Regression result = frame:RegressionOnSelected()
output result:GetFormalSummary()

Remove(Libraries.Compute.Statistics.DataFrameSelectionListener listener)

Classes can also de-register as listeners of the selection in the DataFrame.

Parameters

RemoveColumnAt(integer index)

This action removes a column from the data frame. It is destructive in that it changes the existing DataFrame without making a copy.

Parameters

  • integer index

Example

//We need the DataFrame class to load in files for Data Science operations.
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Columns.NumberColumn
use Libraries.Containers.Array
use Libraries.Compute.Statistics.DataFrameColumn

//Create a DataFrame, which is essentially a table that understands 
//more information about the data that is being loaded.
DataFrame frame

//This creates a NumberColumn, which contains numbers
NumberColumn column
column:SetHeader("My Column")
column:Add(1)
column:Add(2)
column:Add(3)
column:Add(4)
column:Add(5)
column:Add(6)
frame:AddColumn(column)

//Remove the column we just added, which sits at index 0
frame:RemoveColumnAt(0)

//Output the frame as text to the console to confirm the column is gone
output frame:ToText()

RemoveColumnOnLoad(integer index)

This action removes a column from the on load procedure.

Parameters

  • integer index

RemoveSelectedCell(Libraries.Containers.Support.Pair<integer> index)

This removes a cell, identified by its pair of indices, from anywhere in the selection.

Parameters

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Containers.Support.Pair

DataFrame frame
frame:Load("Data.csv")
Pair<integer> pair
pair:Set(0,0)

//be sure to send the same object
//the system looks at the object code, not
//the value in the pair
frame:AddSelectedCell(pair)
frame:RemoveSelectedCell(pair)
output frame:ToText()

RemoveSelectedCellAt(integer index)

This removes a cell at the index of the selection. This is the index of the cell in the selection, not the x,y coordinate of the cell to remove.

Parameters

  • integer index: the selection index of the cell to remove

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedCell(0,0)

//this is the first value in the selection, so remove at index 0.
frame:RemoveSelectedCellAt(0)
output frame:ToText()

RemoveSelectedColumn(integer index)

This removes a column of a particular index anywhere from the selection.

Parameters

  • integer index: the column index of the column to remove

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:RemoveSelectedColumn(0)
output frame:ToText()

RemoveSelectedColumnAt(integer index)

This removes a column at the index of the selection.

Parameters

  • integer index: the selection index of the column to remove

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:RemoveSelectedColumnAt(0)
output frame:ToText()

RemoveSelectedColumnRange(integer start, integer finish)

This removes columns from the selection, starting from start and ending at finish, inclusive. Calculations will no longer be conducted on the columns in this range.

Parameters

  • integer start: the start of the range
  • integer finish: the end of the range

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0, 2)
frame:RemoveSelectedColumnRange(1, 2)
output frame:ToText()

RemoveSelectedColumns(text headers)

This action reads a comma separated list of header names and determines the indices from this list. This action is inherently strict, where if the parsing fails, the headers are not unique, or there are other issues in the list, this action throws an error. This action removes the listed headers from the selection.

Parameters

  • text headers: the columns to remove from the selection

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:SetSelectedColumns("name1,name2")
frame:RemoveSelectedColumns("name2")
output frame:GetSelectedColumnSize()

RemoveSelectedFactor(integer index)

This removes a factor of a particular index anywhere from the selection.

Parameters

  • integer index: the factor index of the factor to remove

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedFactor(0)
frame:RemoveSelectedFactor(0)
output frame:ToText()

RemoveSelectedFactorAt(integer index)

This removes a factor at a particular index from the selection.

Parameters

  • integer index: the selection index of the factor to remove

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedFactor(0)
frame:RemoveSelectedFactorAt(0)
output frame:ToText()

RemoveSelectedFactorRange(integer start, integer finish)

This removes factors from the selection, starting from start and ending at finish, inclusive. Calculations will no longer be conducted on the factors in this range.

Parameters

  • integer start: the start of the range
  • integer finish: the end of the range

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedFactorRange(0, 2)
frame:RemoveSelectedFactorRange(1, 2)
output frame:ToText()

RemoveSelectedFactors(text headers)

This action reads a comma separated list of header names and determines the indices from this list. This action is inherently strict, where if the parsing fails, the headers are not unique, or there are other issues in the list, this action throws an error. This action removes the listed headers from the selected factors.

Parameters

  • text headers: the factors to remove from the selection

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:SetSelectedFactors("name1,name2")
frame:RemoveSelectedFactors("name2")
output frame:GetSelectedColumnSize()

RemoveSelectedRow(integer index)

This removes a row of a particular index anywhere from the selection.

Parameters

  • integer index: the row index of the row to remove

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedRow(0)
frame:RemoveSelectedRow(0)
output frame:ToText()

RemoveSelectedRowAt(integer index)

This removes a row at the index of the selection.

Parameters

  • integer index: the selection index of the row to remove

Example

use Libraries.Compute.Statistics.DataFrame

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedRow(0)
frame:RemoveSelectedRowAt(0)
output frame:ToText()

RemoveUndefinedRows()

This action removes undefined values from the DataFrame. This action creates a copy of the DataFrame, so it is not destructive. For this version, which does not take into account selection, this action removes a row if any value, in any column, is undefined.

Return

Libraries.Compute.Statistics.DataFrame: a new DataFrame copy with all rows with undefined in any position removed

Example

use Libraries.Compute.Statistics.DataFrame
DataFrame frame
frame:Load("file.csv")
output frame:RemoveUndefinedRows():ToText()

RemoveUndefinedRowsFromSelectedColumns()

This action removes undefined values from the DataFrame. This action creates a copy of the DataFrame, so it is not destructive. For this version, rows are removed only if the undefined value was in a selected column. Selected factors are not taken into account.

Return

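Libraries.Compute.Statistics.DataFrame: a new DataFrame copy with all rows that have an undefined value in a selected column removed

Example

use Libraries.Compute.Statistics.DataFrame
DataFrame frame
frame:Load("file.csv")

//only undefined values in the first column cause a row to be removed
frame:AddSelectedColumn(0)
output frame:RemoveUndefinedRowsFromSelectedColumns():ToText()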
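
The two removal actions differ only in which columns are checked for undefined values. The sketch below illustrates that difference in Python rather than Quorum, using None to stand in for undefined. The helper name drop_undefined and the row layout are hypothetical, and the code is not the library's implementation.

```python
# Each row is a dictionary of column name to value; None means undefined.
rows = [
    {"a": 1, "b": 2.0, "c": "x"},
    {"a": None, "b": 3.0, "c": "y"},
    {"a": 4, "b": None, "c": "z"},
]

def drop_undefined(rows, columns=None):
    """Return a new list without rows that have None in the checked columns."""
    kept = []
    for row in rows:
        # With no column list, every column is checked.
        checked = row.keys() if columns is None else columns
        if all(row[name] is not None for name in checked):
            kept.append(dict(row))  # copy, leaving the input untouched
    return kept

everywhere = drop_undefined(rows)     # checks every column
only_a = drop_undefined(rows, ["a"])  # checks only column "a"

print(len(everywhere), len(only_a))
```

Checking every column keeps only the first row, while checking only column "a" also keeps the third row, whose undefined value sits in a column that was not selected.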

ReplaceUndefined(text value)

This action replaces undefined values in the DataFrame with a value derived from the text. If the text will not result in a replacement, the request is ignored in the column. This action replaces the undefined values across the entire DataFrame.

Parameters

  • text value

Example

use Libraries.Compute.Statistics.DataFrame
DataFrame frame
frame:Load("file.csv")
//perhaps this replaces an undefined in an IntegerColumn or a TextColumn
frame:ReplaceUndefined("0")
output frame:ToText()

ReplaceUndefinedFromSelectedColumns(text value)

This action replaces undefined values in the DataFrame with a value derived from the text. If the text will not result in a replacement, the request is ignored in the column. This action replaces the undefined values across only the selected columns. Selected factors are ignored.

Parameters

  • text value

Example

use Libraries.Compute.Statistics.DataFrame
DataFrame frame
frame:Load("file.csv")

//Add the first column in the set to the list of selected columns
frame:AddSelectedColumn(0)

//perhaps this replaces an undefined in an IntegerColumn or a TextColumn
frame:ReplaceUndefinedFromSelectedColumns("0")
output frame:ToText()

RoundSelectedColumns(integer decimalPlace)

This action rounds values in selected number columns to a specified decimal place. If the value is 1.3209873 and the decimal place is set to 2, the value will be rounded to 1.32.

Parameters

  • integer decimalPlace

Return

Libraries.Compute.Statistics.DataFrame:

Example

use Libraries.Compute.Statistics.DataFrame
DataFrame frame
frame:Load("file.csv")

//set which column or columns we want to round values
frame:AddSelectedColumn(3)
DataFrame result = frame:RoundSelectedColumns(2)
output result:ToText()

Save(text location)

This action saves a data frame to a file relative to the working directory, which is typically where the executable lives. The file must have a csv file extension for this to save. Otherwise, it fails silently.

Parameters

  • text location: The file to save. This action uses the default format of comma separated values (CSV).

Example

use Libraries.Compute.Statistics.DataFrame

//Save the data frame to a comma separated file
DataFrame frame
frame:Save("Data.csv")

Save(Libraries.System.File file, Libraries.Compute.Statistics.DataFrameSaver saver)

This action saves data from a data frame to a file, using the given saver to determine the format.

Parameters

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Loaders.CommaSeparatedLoader
use Libraries.Compute.Statistics.Loaders.CommaSeparatedSaver
use Libraries.System.File

//Load a comma separated file
DataFrame frame
File file
file:SetPath("Data.csv")
CommaSeparatedLoader loader
frame:Load(file, loader) 

CommaSeparatedSaver saver
frame:Save(file, saver)

ScatterPlot()

This action creates a ScatterPlot from the current two column selection in the DataFrame. By default, it uses all columns as separate values in the selection as the chart area. Factors have no impact on this action.

Return

Libraries.Interface.Controls.Charts.ScatterPlot: a chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.ScatterPlot

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(1)
frame:AddSelectedColumn(2)
ScatterPlot chart = frame:ScatterPlot()
chart:SetTitle("My Awesome Title")
chart:SetXAxisTitle("Time")
chart:Display()

ScatterPlotByColumn()

This action creates a ScatterPlot from the values of two selected columns, grouped by the selected factors.

Return

Libraries.Interface.Controls.Charts.ScatterPlot: a chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.ScatterPlot

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(1)
frame:AddSelectedColumn(2)
frame:AddSelectedFactor(3)
ScatterPlot chart = frame:ScatterPlotByColumn()
chart:SetTitle("My Awesome Title")
chart:SetXAxisTitle("Time")
chart:Display()

ScreePlot()

This action creates a ScreePlot from the correlation matrix of the selected columns in the DataFrame. No factors are allowed for this chart type and it provides only a line chart. Copying the ScreePlot Creator's implementation allows the creation of alternative chart types for the same chart, but this is not supported by default.

Return

Libraries.Interface.Controls.Charts.LineChart: a LineChart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.LineChart

DataFrame frame
frame:Load("Data.csv")
frame:SelectAllColumns()
LineChart chart = frame:ScreePlot()
chart:SetTitle("My Scree Plot")
chart:Display()

SelectAllColumns()

This removes the current selected columns and adds all columns into the selection. Selected factors are not changed.

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:SelectAllColumns()
output frame:ToText()

SetColumns(Libraries.Containers.Array<Libraries.Compute.Statistics.DataFrameColumn> columns)

This action replaces the columns in the DataFrame. It is needed by the Loader infrastructure in order to change the columns. However, for most users, the Transform infrastructure should be used instead of adjusting these manually. In other words, do not use this action unless you know what you are doing.

Parameters

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.DataFrameColumn
use Libraries.Containers.Array

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv") 
Array<DataFrameColumn> col
frame:SetColumns(col)

SetSelectedColumnRange(integer start, integer finish)

This sets the selected columns to a range, starting from start and ending at finish, inclusive. In this case, this means that calculations will be conducted across this entire range. If any previous range was indicated, it is removed.

Parameters

  • integer start: the start of the range
  • integer finish: the end of the range

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:SetSelectedColumnRange(0, 2)
output frame:ToText()

SetSelectedColumns(text headers)

This action reads a comma separated list of header names and determines the indices from this list. This action is inherently strict, where if the parsing fails, the headers are not unique, or there are other issues in the list, this action throws an error. This action removes any previous selection.

Parameters

  • text headers: the columns to select

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:SetSelectedColumns("name1,name2")
output frame:GetSelectedColumnSize()

SetSelectedFactorRange(integer start, integer finish)

This sets the selected factors to a range, starting from start and ending at finish, inclusive. In this case, this means that calculations will be conducted across this entire range. If any previous range was indicated, it is removed.

Parameters

  • integer start: the start of the range
  • integer finish: the end of the range

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:SetSelectedFactorRange(0, 2)
output frame:ToText()

SetSelectedFactors(text headers)

This action reads a comma separated list of header names and determines the indices from this list. This action is inherently strict, where if the parsing fails, the headers are not unique, or there are other issues in the list, this action throws an error. This action removes any previous selection.

Parameters

  • text headers: the factors to select

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:SetSelectedFactors("name1,name2")
output frame:GetSelectedColumnSize()

SetSelection(Libraries.Compute.Statistics.DataFrameSelection selection)

This action sets the selection in the DataFrame.

Parameters

Skew()

This action calculates the skew of the selected column.

Return

number: the skew

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
output frame:Skew()

SkewSelectedColumns()

This action calculates the skew of the selected columns. In this case, the full calculation objects are returned.

Return

Libraries.Containers.Array: an array of the calculations

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
Array<Skew> values = frame:SkewSelectedColumns()

Sort(text headers)

This action reads a comma separated list of header names, determines the indices from this list, and sorts by the columns it names. This action is inherently strict, where if the parsing fails, the headers are not unique, or there are other issues in the list, this action throws an error.

Parameters

  • text headers: the columns to sort

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:Sort("name1,name2")
output frame:ToText()

SplitSelectedColumns(Libraries.Containers.Array<text> headers, text delimiter)

This action splits the selected columns into new columns using the delimiter (typically a character) to split each row. For example, if we have a column called stats with values like 22-2, and a delimiter of dash (-), we would split into two columns, one with 22 and the other with 2. This is the simplest case of column splitting. For more advanced forms of splitting, we can override the ColumnSplitter class and override a single action, Split, which takes a text value and returns an array of values to be placed into the new columns.

Parameters

Return

Libraries.Compute.Statistics.DataFrame:

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Containers.Array
DataFrame frame
frame:Load("file.csv")

//create the names for the new columns
Array<text> headers
headers:Add("points")
headers:Add("thingy")

//set which column or columns we want to split in this way
//then split the values and output them to the console
frame:AddSelectedColumn(3)
DataFrame result = frame:SplitSelectedColumns(headers, "-")
output result:ToText()
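
The splitting step itself can be sketched outside of Quorum. The Python below is an illustration only, not the library's code: split_column is a hypothetical helper, and padding short rows with None is an assumption about how missing pieces might be treated.

```python
def split_column(values, headers, delimiter):
    """Split each text value on the delimiter into one new column per header."""
    columns = {header: [] for header in headers}
    for value in values:
        parts = value.split(delimiter)
        # Pad rows that produce too few pieces so every column stays aligned.
        parts += [None] * (len(headers) - len(parts))
        for header, part in zip(headers, parts):
            columns[header].append(part)
    return columns

result = split_column(["22-2", "18-7", "30-0"], ["points", "thingy"], "-")
print(result["points"])
print(result["thingy"])
```

Each value such as 22-2 contributes 22 to the first new column and 2 to the second, which mirrors the stats example described for this action.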

SplitSelectedColumns(Libraries.Containers.Array<text> headers, text delimiter, boolean includeUndefined)

This action splits the selected columns into new columns using the delimiter (typically a character) to split each row. For example, if we have a column called stats with values like 22-2, and a delimiter of dash (-), we would split into two columns, one with 22 and the other with 2. This version can explicitly include or exclude undefined cells. This is the simplest case of column splitting. For more advanced forms of splitting, we can override the ColumnSplitter class and override a single action, Split, which takes a text value and returns an array of values to be placed into the new columns.

Parameters

Return

Libraries.Compute.Statistics.DataFrame:

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Containers.Array
DataFrame frame
frame:Load("file.csv")

//create the names for the new columns
Array<text> headers
headers:Add("points")
headers:Add("thingy")

//set which column or columns we want to split in this way
//then split the values and output them to the console
frame:AddSelectedColumn(3)
DataFrame result = frame:SplitSelectedColumns(headers, "-", true)
output result:ToText()

SplitSelectedColumnsRandomlyByRows(number percent)

The SplitRowsTransform class takes a data frame and transforms it into two DataFrames with copies of all of the data inside them. Specifically, you can set a percentage of the total data, like 0.8 for 80%, which then instructs the transform to select 80% of the rows, randomly, and place it into one DataFrame copy, with the remainder going in the other. This operation is synchronized across all columns so that it is the same split everywhere.

Parameters

  • number percent: the percent of the split in the first element of the return. The second will contain 1.0 - percent.

Return

Libraries.Containers.Array: an array of DataFrame objects with exactly 2 spots. The first is the selected rows and the second is the rejected rows.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Containers.Array

DataFrame frame
frame:LoadFromCommaSeparatedValue(
"Hello, Hi
0, 17
1, 19
2, 21
3, 23
4, 25
5, 27
6, 29
7, 31
8, 33
9, 35
10, 35"
)

frame:AddSelectedColumn(0)
frame:AddSelectedColumn(1)

//Take half the DataFrame's rows
Array<DataFrame> frames = frame:SplitSelectedColumnsRandomlyByRows(0.5)
output frames:Get(0):ToText()
output frames:Get(1):ToText()

SplitSelectedColumnsRandomlyByRows(number percent, number seed)

The SplitRowsTransform class takes a data frame and transforms it into two DataFrames with copies of all of the data inside them. Specifically, you can set a percentage of the total data, like 0.8 for 80%, which then instructs the transform to select 80% of the rows, randomly, and place it into one DataFrame copy, with the remainder going in the other. This operation is synchronized across all columns so that it is the same split everywhere.

Parameters

  • number percent: the percent of the split in the first element of the return. The second will contain 1.0 - percent.
  • number seed: The seed passed to the random number generator.

Return

Libraries.Containers.Array: an array of DataFrame objects with exactly 2 spots. The first is the selected rows and the second is the rejected rows.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Containers.Array

DataFrame frame
frame:LoadFromCommaSeparatedValue(
"Hello, Hi
0, 17
1, 19
2, 21
3, 23
4, 25
5, 27
6, 29
7, 31
8, 33
9, 35
10, 35"
)

frame:AddSelectedColumn(0)
frame:AddSelectedColumn(1)

//take half the DataFrame's rows and use a random seed of 0
Array<DataFrame> frames = frame:SplitSelectedColumnsRandomlyByRows(0.5, 0)
output frames:Get(0):ToText()
output frames:Get(1):ToText()
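
Passing a seed makes a random split repeatable. The sketch below shows the general technique in Python rather than Quorum. It is not the SplitRowsTransform implementation: split_rows is a hypothetical helper, and rounding the row count is an assumption. Whole rows are moved together, which is what keeps the split the same across every column.

```python
import random

def split_rows(rows, percent, seed=None):
    """Randomly place about percent of the rows in the first list, the rest in the second."""
    rng = random.Random(seed)  # the same seed reproduces the same shuffle
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    cut = round(len(rows) * percent)
    chosen = set(indices[:cut])
    # Keep the original row order inside each half.
    selected = [row for i, row in enumerate(rows) if i in chosen]
    rejected = [row for i, row in enumerate(rows) if i not in chosen]
    return selected, rejected

# Ten illustrative two-column rows.
rows = [(i, 17 + 2 * i) for i in range(10)]
first, second = split_rows(rows, 0.8, seed=0)
again, _ = split_rows(rows, 0.8, seed=0)

print(len(first), len(second), first == again)
```

With ten rows and a percent of 0.8, eight rows land in the first list and two in the second, and repeating the call with the same seed selects the same rows.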

SplitSelectedColumnsRandomlyByRows()

The SplitRowsTransform class takes a data frame and transforms it into two DataFrames with copies of all of the data inside them. Specifically, you can set a percentage of the total data, like 0.8 for 80%, which then instructs the transform to select 80% of the rows, randomly, and place it into one DataFrame copy, with the remainder going in the other. This operation is synchronized across all columns so that it is the same split everywhere.

Return

Libraries.Containers.Array: an array of DataFrame objects with exactly 2 spots. The first is the selected rows and the second is the rejected rows.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Containers.Array

DataFrame frame
frame:LoadFromCommaSeparatedValue(
"Hello, Hi
0, 17
1, 19
2, 21
3, 23
4, 25
5, 27
6, 29
7, 31
8, 33
9, 35
10, 35"
)

frame:AddSelectedColumn(0)
frame:AddSelectedColumn(1)

Array<DataFrame> frames = frame:SplitSelectedColumnsRandomlyByRows()
output frames:Get(0):ToText()
output frames:Get(1):ToText()

StandardDeviation()

This action calculates the standard deviation of the selected column.

Return

number: the standard deviation

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
output frame:StandardDeviation()

StandardDeviationSelectedColumns()

This action calculates the standard deviation of the selected columns. In this case, the full calculation objects are returned.

Return

Libraries.Containers.Array: an array of the calculations

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
Array<StandardDeviation> values = frame:StandardDeviationSelectedColumns()

Summarize()

This action calculates summary information for the selected column.

Return

Libraries.Compute.Statistics.Calculations.Summarize: the summary

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Calculations.Summarize
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
Summarize summary = frame:Summarize()

SummarizeSelectedColumns()

This action calculates summaries of the selected columns. In this case, the full calculation objects are returned.

Return

Libraries.Containers.Array: an array of the calculations

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Calculations.Summarize
use Libraries.Containers.Array
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
Array<Summarize> values = frame:SummarizeSelectedColumns()

ToText()

This action returns a text based representation of the data frame.

Return

text: a text based representation, in comma separated format

Example

use Libraries.Compute.Statistics.DataFrame

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv")

//output the frame to the console
output frame:ToText()

ToText(integer rows)

This action returns a text based representation of the data frame, limited to the requested number of rows.

Parameters

  • integer rows: The number of rows to get

Return

text: a text based representation, in comma separated format

Example

use Libraries.Compute.Statistics.DataFrame

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv")

//output the frame to the console
output frame:ToText(10)

ToText(integer rows, boolean header)

This action returns a text based representation of the data frame, limited to the requested number of rows, with or without the header.

Parameters

  • integer rows: The number of rows to get
  • boolean header: Whether or not to show the header from the frame

Return

text: a text based representation, in comma separated format

Example

use Libraries.Compute.Statistics.DataFrame

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv")

//output the frame to the console
output frame:ToText(10, false)

Transform(Libraries.Compute.Statistics.DataFrameTransform transform)

This action transforms the data in the current DataFrame and returns the result as a copy. While custom Transforms can choose to adjust the original, by default they do not.

Parameters

Return

Libraries.Compute.Statistics.DataFrame: Typically a copy of the DataFrame, transformed by the transformer.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Transforms.ReverseTransform
use Libraries.System.File

//Load a comma separated file
DataFrame frame
frame:Load("Data.csv")

//reverse the data frame; the result comes back as a copy
ReverseTransform reverse
DataFrame result = frame:Transform(reverse)
output result:ToText()

Variance()

This action calculates the sample variance of the selected column.

Return

number: the variance

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
output frame:Variance()

VarianceSelectedColumns()

This action calculates the sample variance of the selected columns. In this case, the full calculation objects are returned.

Return

Libraries.Containers.Array: an array of the calculations

Example

use Libraries.Compute.Statistics.DataFrame
    
DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumnRange(0,2)
Array<Variance> values = frame:VarianceSelectedColumns()

ViolinPlot()

This action creates a ViolinPlot from the current column selection in the DataFrame. By default, it uses all columns as separate values in the selection as the chart area. Factors have no impact on this action.

Return

Libraries.Interface.Controls.Charts.ViolinPlot: a chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.ViolinPlot

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(1)
ViolinPlot chart = frame:ViolinPlot()
chart:SetTitle("My Awesome Title")
chart:SetXAxisTitle("Time")
chart:Display()

ViolinPlotByColumn()

This action creates a ViolinPlot from the values of the selected columns, grouped by the selected factors.

Return

Libraries.Interface.Controls.Charts.ViolinPlot: a ViolinPlot chart that can be displayed or placed into a user interface or game.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Interface.Controls.Charts.ViolinPlot

DataFrame frame
frame:Load("Data.csv")
frame:AddSelectedColumn(0)
frame:AddSelectedFactor(3)

ViolinPlot chart = frame:ViolinPlotByColumn()
chart:Display()