Libraries.Compute.Statistics.Tests.Regression Documentation
This class conducts an Ordinary Least Squares regression on a DataFrame. By default, an intercept is calculated and included in the model. More information about this kind of statistical test can be found here: https://en.wikipedia.org/wiki/Ordinary_least_squares. The class was adapted from the equivalent model in Apache Commons, but was expanded upon to simplify the library and add a variety of helper actions that were missing. More information about the Apache Commons class can be found on its documentation page: https://commons.apache.org/proper/commons-math/javadocs/api-3.6/org/apache/commons/math3/stat/regression/OLSMultipleLinearRegression.html
Example Code
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Columns.NumberColumn
use Libraries.Containers.Array
use Libraries.Compute.Statistics.DataFrameColumn
use Libraries.Compute.Statistics.Tests.Regression
DataFrame frame
NumberColumn column0
column0:SetHeader("y")
column0:Add("1")
column0:Add("2")
column0:Add("3")
column0:Add("4")
column0:Add("5")
column0:Add("6")
NumberColumn column2
column2:SetHeader("2")
column2:Add("12.0")
column2:Add("6")
column2:Add("-4")
column2:Add("1")
column2:Add("97")
column2:Add("65")
NumberColumn column3
column3:SetHeader("3")
column3:Add("-51.0")
column3:Add("167")
column3:Add("24")
column3:Add("2")
column3:Add("120")
column3:Add("69")
NumberColumn column4
column4:SetHeader("4")
column4:Add("4")
column4:Add("-68")
column4:Add("-41")
column4:Add("3")
column4:Add("159")
column4:Add("73")
Array<DataFrameColumn> columns
columns:Add(column0)
columns:Add(column2)
columns:Add(column3)
columns:Add(column4)
frame:SetColumns(columns)
Regression regression
regression:SetDependentVariable(0)
regression:AddFactor(1)
regression:AddFactor(2)
regression:AddFactor(3)
frame:Calculate(regression)
//Output a series of attributes about the regression
output "Beta: " + regression:GetCoefficients():ToText()
output "Beta p-values: " + regression:GetCoefficientProbabilityValues():ToText()
output "Residuals: " + regression:GetResiduals():ToText()
output "Residual Sum of Squares: " + regression:GetResidualSumOfSquares()
output "Total Sum of Squares: " + regression:GetTotalSumOfSquares()
output "F = " + regression:GetCriticalValue()
output "p = " + regression:GetProbabilityValue()
output "R^2: " + regression:GetEffectSize()
Inherits from: Libraries.Compute.Statistics.DataFrameCalculation, Libraries.Compute.Statistics.Inputs.ColumnInput, Libraries.Language.Object, Libraries.Compute.Statistics.Inputs.FactorInput
Summary
Actions Summary Table
Actions | Description |
---|---|
AddColumn(integer column) | This action adds a value to the end of the input. |
AddFactor(integer column) | This action adds a value to the end of the input. |
Calculate(Libraries.Compute.Statistics.DataFrame frame) | |
CalculateAdjustedEffectSize(Libraries.Compute.Matrix predictors, number r2, boolean intercept) | This action calculates an adjusted effect size. |
CalculateCriticalValue(Libraries.Compute.Matrix predictors, number r2) | This action returns the critical value for the matrix and the given effect size (R^2). |
CalculateDenominatorDegreesOfFreedom(Libraries.Compute.Matrix predictors) | This calculates the degrees of freedom of the denominator in the F-ratio. |
CalculateEffectSize(number residualSumOfSquares, number totalSumOfSquares) | This action returns the effect size for the calculation. |
CalculateErrorVariance(Libraries.Compute.Matrix predictors, Libraries.Compute.Vector residuals) | This action calculates the total error variance from the residuals. |
CalculateNumeratorDegreesOfFreedom(Libraries.Compute.Matrix predictors) | This calculates the degrees of freedom of the numerator in the F-ratio. |
CalculatePValues(Libraries.Compute.Matrix predictors, Libraries.Compute.MatrixTransform.OrthonormalTriangularDecomposition ortho, Libraries.Compute.Vector residuals, Libraries.Compute.Vector betas) | This action calculates the probability values for each beta-coefficient in the model. |
CalculateProbabilityValue(Libraries.Compute.Matrix predictors, number criticalValue) | This action returns the probability value (p-value) for the overall regression. |
CalculateRegressionParametersStandardErrors(Libraries.Compute.Matrix predictors, Libraries.Compute.MatrixTransform.OrthonormalTriangularDecomposition ortho, Libraries.Compute.Vector residuals) | This action calculates the standard errors from the residuals. |
CalculateResidualSumOfSquares(Libraries.Compute.Vector residuals) | This action calculates the sum of squares of the residuals. |
CalculateResiduals(Libraries.Compute.Vector y, Libraries.Compute.Matrix predictors, Libraries.Compute.Vector b) | This action calculates the residuals. |
CalculateTotalSumOfSquares(Libraries.Compute.Statistics.DataFrameColumn column, boolean intercept) | This action calculates the sum of squares for an instance of this regression. |
CalculateVarianceCovarianceMatrix(Libraries.Compute.Matrix predictors, Libraries.Compute.MatrixTransform.OrthonormalTriangularDecomposition ortho) | This action calculates the variance-covariance matrix. |
Compare(Libraries.Language.Object object) | This action compares two object hash codes and returns an integer. |
EmptyColumns() | This action empties the list, clearing out all of the items contained within it. |
EmptyFactors() | This action empties the list, clearing out all of the items contained within it. |
Equals(Libraries.Language.Object object) | This action determines if two objects are equal based on their hash code values. |
GetAdjustedEffectSize() | Returns the total adjusted effect size, in statistics typically termed adjusted R^2 (R-squared). |
GetCoefficientProbabilityValues() | Returns the probability values for the beta coefficients. |
GetCoefficients() | Returns the total beta coefficients. |
GetColumn(integer index) | This action gets the item at a given location in an array. |
GetColumnIterator() | This action gets an iterator for the object and returns that iterator. |
GetColumnSize() | This action gets the size of the array. |
GetCriticalValue() | Returns the critical value |
GetEffectSize() | Returns the total effect size, in statistics typically termed R^2 (R-squared). |
GetFactor(integer index) | This action gets the item at a given location in an array. |
GetFactorIterator() | This action gets an iterator for the object and returns that iterator. |
GetFactorSize() | This action gets the size of the array. |
GetFormalSummary() | This action summarizes the result and places it into formal academic language, in APA format. |
GetHashCode() | This action gets the hash code for an object. |
GetProbabilityValue() | Returns the probability value |
GetResidualSumOfSquares() | Returns the total residual sum of squares. |
GetResiduals() | Returns the residuals. |
GetTotalSumOfSquares() | Returns the total sum of squares. |
HasIntercept() | Returns whether or not this regression includes an intercept. |
IsEmptyColumns() | This action returns a boolean value, true if the container is empty and false if it contains any items. |
IsEmptyFactors() | This action returns a boolean value, true if the container is empty and false if it contains any items. |
RemoveColumn(integer column) | This action removes the first occurrence of an item that is found in the Addable object. |
RemoveColumnAt(integer index) | This action removes an item from an indexed object and returns that item. |
RemoveFactor(integer column) | This action removes the first occurrence of an item that is found in the Addable object. |
RemoveFactorAt(integer index) | This action removes an item from an indexed object and returns that item. |
SetHasIntercept(boolean hasIntercept) | Sets whether or not this regression includes an intercept. |
Actions Documentation
AddColumn(integer column)
This action adds a value to the end of the input.
Parameters
AddFactor(integer column)
This action adds a value to the end of the input.
Parameters
Calculate(Libraries.Compute.Statistics.DataFrame frame)
Parameters
CalculateAdjustedEffectSize(Libraries.Compute.Matrix predictors, number r2, boolean intercept)
This action calculates an adjusted effect size. This adjustment accounts for the number of predictors included in the model.
Parameters
- Libraries.Compute.Matrix: The matrix of predictor values.
- number r2: the effect size
- boolean intercept: whether the regression includes an intercept.
Return
number: The adjusted R^2.
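The adjustment can be written out as a formula. The sketch below assumes this class keeps the calculation used by the Apache Commons model it was adapted from, which is not confirmed here. In it, n is the number of rows and p the number of columns in the predictor matrix.

```latex
\bar{R}^2 =
\begin{cases}
1 - \dfrac{(1 - R^2)\,(n - 1)}{n - p} & \text{with an intercept} \\[2ex]
1 - \dfrac{(1 - R^2)\, n}{n - p} & \text{without an intercept}
\end{cases}
```

For instance, with an intercept, R^2 = 0.8, n = 6, and p = 4, this gives 1 - (0.2 * 5) / 2 = 0.5.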
CalculateCriticalValue(Libraries.Compute.Matrix predictors, number r2)
This action returns the critical value for the matrix and the given effect size (R^2). In statistics, this value is typically termed an "F-value," which describes where a result falls on the F-distribution. We calculate this value with the following equation: F = (R^2 / (p - 1)) / ((1 - R^2) / (n - p)), where n is the number of rows and p is the number of columns in the predictor matrix. While an example is included on how to calculate this value, it is complicated and we highly recommend calculating the regression and just calling GetCriticalValue instead.
Example Code
use Libraries.Compute.Matrix
use Libraries.Compute.Vector
use Libraries.Compute.MatrixTransform.OrthonormalTriangularDecomposition
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.DataFrameColumn
use Libraries.Compute.Statistics.Tests.Regression
DataFrame frame
frame:Load("Data/Data.csv")
Regression regression
//Assumption: the dependent variable is the first column and the model has an intercept
integer predictedColumn = 0
boolean hasIntercept = true
DataFrameColumn column = frame:GetColumn(predictedColumn)
//Assumption: "transformed" is a DataFrame, prepared beforehand, holding only the predictor columns
Matrix predictorMatrix = transformed:ConvertToMatrix()
OrthonormalTriangularDecomposition decomp
decomp:Calculate(predictorMatrix)
//Only continue if the dependent column can be converted to a vector
if column:CanConvertToVector()
Vector predicted = column:ConvertToVector()
//Solve for the beta coefficients, then work up to the F-value
Vector beta = decomp:Solve(predicted)
Vector residuals = regression:CalculateResiduals(predicted, predictorMatrix, beta)
number residualSumOfSquares = regression:CalculateResidualSumOfSquares(residuals)
number totalSumOfSquares = regression:CalculateTotalSumOfSquares(column, hasIntercept)
number r2 = regression:CalculateEffectSize(residualSumOfSquares, totalSumOfSquares)
number fValue = regression:CalculateCriticalValue(predictorMatrix, r2)
output fValue
end
Parameters
- Libraries.Compute.Matrix: The matrix of predictor values.
- number r2: The effect size (R^2).
Return
number: This returns the critical value, typically called "F" in statistics.
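As a worked illustration of the F-value, take hypothetical values R^2 = 0.8, n = 6 rows, and p = 4 columns in the predictor matrix. These numbers are chosen only for the arithmetic and do not come from a real data set.

```latex
F = \frac{R^2 / (p - 1)}{(1 - R^2) / (n - p)}
  = \frac{0.8 / 3}{0.2 / 2}
  = \frac{0.2667}{0.1}
  \approx 2.667
```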
CalculateDenominatorDegreesOfFreedom(Libraries.Compute.Matrix predictors)
This calculates the degrees of freedom of the denominator in the F-ratio. It is equivalent to the number of rows in the matrix minus the number of columns.
Parameters
- Libraries.Compute.Matrix: a matrix to calculate on
Return
number: The number of rows in the matrix minus the number of columns.
CalculateEffectSize(number residualSumOfSquares, number totalSumOfSquares)
This action returns the effect size for the calculation. The technical name for this effect is "R^2" and it is calculated as 1 minus the residual sum of squares divided by the total sum of squares.
Parameters
- number residualSumOfSquares: The residual sum of squares
- number totalSumOfSquares: The total sum of squares
Return
number: The effect size (R^2).
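A minimal sketch of calling this action directly is shown below. The two sums of squares are hypothetical values chosen only for illustration. By the formula described above, 1 - 25 / 100 works out to 0.75.

```quorum
use Libraries.Compute.Statistics.Tests.Regression
Regression regression
//Hypothetical sums of squares, chosen only for illustration
number residualSumOfSquares = 25.0
number totalSumOfSquares = 100.0
//R^2 = 1 - residualSumOfSquares / totalSumOfSquares
number r2 = regression:CalculateEffectSize(residualSumOfSquares, totalSumOfSquares)
output "R^2: " + r2
```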
CalculateErrorVariance(Libraries.Compute.Matrix predictors, Libraries.Compute.Vector residuals)
This action calculates the total error variance from the residuals.
Parameters
- Libraries.Compute.Matrix: The matrix of values
- Libraries.Compute.Vector: The errors in the model
Return
number: The error variance.
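The usual ordinary least squares estimate of the error variance is sketched below. It assumes this class follows the Apache Commons model it was adapted from. Here e is the vector of residuals, n the number of rows, and p the number of columns in the predictor matrix.

```latex
\hat{\sigma}^2 = \frac{e^{\mathsf{T}} e}{n - p} = \frac{\sum_{i=1}^{n} e_i^2}{n - p}
```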
CalculateNumeratorDegreesOfFreedom(Libraries.Compute.Matrix predictors)
This calculates the degrees of freedom of the numerator in the F-ratio. It is equivalent to the number of columns in the matrix minus 1.
Parameters
- Libraries.Compute.Matrix: a matrix to calculate on
Return
number: The number of columns in the matrix minus 1.
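As a small worked case, consider a predictor matrix with n = 6 rows and p = 4 columns. This would match a regression on six observations with three factors, under the assumption that the matrix also carries a column for the intercept.

```latex
df_{\text{numerator}} = p - 1 = 4 - 1 = 3
\qquad
df_{\text{denominator}} = n - p = 6 - 4 = 2
```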
CalculatePValues(Libraries.Compute.Matrix predictors, Libraries.Compute.MatrixTransform.OrthonormalTriangularDecomposition ortho, Libraries.Compute.Vector residuals, Libraries.Compute.Vector betas)
This action calculates the probability values for each beta-coefficient in the model. This is a complex test and should not be used externally unless you really know what you are doing.
Parameters
- Libraries.Compute.Matrix: The matrix of values
- Libraries.Compute.MatrixTransform.OrthonormalTriangularDecomposition: A decomposition object for the matrix
- Libraries.Compute.Vector: The errors in the model
- Libraries.Compute.Vector: The coefficients
Return
Libraries.Compute.Vector: a vector of the probability values
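The conventional way to obtain a probability value for each coefficient is a two-sided t-test, sketched below. Whether this class uses exactly this test is an assumption, since the documentation does not state it. T with subscript n - p denotes the cumulative distribution function of Student's t-distribution with n - p degrees of freedom.

```latex
t_j = \frac{\hat{\beta}_j}{\operatorname{SE}(\hat{\beta}_j)},
\qquad
p_j = 2 \left( 1 - T_{n - p}\!\left( \lvert t_j \rvert \right) \right)
```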
CalculateProbabilityValue(Libraries.Compute.Matrix predictors, number criticalValue)
This action returns the probability value (p-value) for the overall regression. This is sometimes called an "omnibus p-value."
Example Code
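use Libraries.Compute.Matrix
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.DataFrameColumn
use Libraries.Compute.Statistics.Tests.Regression
DataFrame frame
frame:Load("Data/Data.csv")
//Assumption: "transformed" is a DataFrame, prepared beforehand, holding only the predictor columns
Matrix predictorMatrix = transformed:ConvertToMatrix()
Regression regression
//0.9 is a sample critical value (F) used for illustration
number p = regression:CalculateProbabilityValue(predictorMatrix, 0.9)
output p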
Parameters
- Libraries.Compute.Matrix: The matrix of predictor values.
- number criticalValue: The critical value (F) of the overall regression.
Return
number: The probability value.
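In the standard formulation, the omnibus probability value is the upper tail of the F-distribution at the critical value. The sketch below assumes this class follows that convention, with n rows and p columns in the predictor matrix.

```latex
\text{p-value} = \Pr(X \ge F),
\qquad
X \sim F(p - 1,\; n - p)
```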
CalculateRegressionParametersStandardErrors(Libraries.Compute.Matrix predictors, Libraries.Compute.MatrixTransform.OrthonormalTriangularDecomposition ortho, Libraries.Compute.Vector residuals)
This action calculates the standard errors from the residuals.
Parameters
- Libraries.Compute.Matrix: The matrix of values
- Libraries.Compute.MatrixTransform.OrthonormalTriangularDecomposition: A decomposition object for the matrix
- Libraries.Compute.Vector: The errors in the model
Return
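The standard errors of the coefficients are conventionally derived from the error variance and the diagonal of the variance-covariance matrix. The sketch below assumes this class keeps that calculation from the Apache Commons model. X is the predictor matrix and the subscript jj picks the j-th diagonal entry.

```latex
\operatorname{SE}(\hat{\beta}_j) = \sqrt{\hat{\sigma}^2 \left[ (X^{\mathsf{T}} X)^{-1} \right]_{jj}},
\qquad
\hat{\sigma}^2 = \frac{e^{\mathsf{T}} e}{n - p}
```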
CalculateResidualSumOfSquares(Libraries.Compute.Vector residuals)
This action calculates the sum of squares of the residuals.
Parameters
- Libraries.Compute.Vector: The residuals
Return
number: returns the sum of squares from the residuals
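Written as a formula, with e denoting the vector of n residuals, the residual sum of squares is:

```latex
\mathrm{RSS} = \sum_{i=1}^{n} e_i^2 = e^{\mathsf{T}} e
```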
CalculateResiduals(Libraries.Compute.Vector y, Libraries.Compute.Matrix predictors, Libraries.Compute.Vector b)
This action calculates the residuals.
Parameters
- Libraries.Compute.Vector: The vector of observed values for the dependent variable
- Libraries.Compute.Matrix: The matrix of values
- Libraries.Compute.Vector: The beta coefficients
Return
Libraries.Compute.Vector: returns the residuals
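In ordinary least squares, the residuals are the differences between the observed values and the values predicted by the model. Using the parameter names of this action, with X standing for the predictors matrix:

```latex
e = y - X b
```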
CalculateTotalSumOfSquares(Libraries.Compute.Statistics.DataFrameColumn column, boolean intercept)
This action calculates the total sum of squares for an instance of this regression. It uses the plain sum of squares if there is no intercept and the second moment (the sum of squared deviations from the mean) if there is one.
Example Code
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.DataFrameColumn
use Libraries.Compute.Statistics.Tests.Regression
DataFrame frame
frame:Load("Data/Data.csv")
DataFrameColumn column = frame:GetColumn("DT")
Regression regression
number value = regression:CalculateTotalSumOfSquares(column, false)
output value
Parameters
- Libraries.Compute.Statistics.DataFrameColumn: The column we are calculating the total from
- boolean intercept: whether or not the regression has an intercept
Return
number: the total sum of squares
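The two cases can be written side by side. This sketch assumes the second moment here means the sum of squared deviations from the mean, as it does in the Apache Commons model. The values y_i are the entries of the column and the barred y is their mean.

```latex
\mathrm{TSS} =
\begin{cases}
\displaystyle\sum_{i=1}^{n} (y_i - \bar{y})^2 & \text{with an intercept} \\[2ex]
\displaystyle\sum_{i=1}^{n} y_i^2 & \text{without an intercept}
\end{cases}
```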
CalculateVarianceCovarianceMatrix(Libraries.Compute.Matrix predictors, Libraries.Compute.MatrixTransform.OrthonormalTriangularDecomposition ortho)
This action calculates the variance-covariance matrix.
Parameters
- Libraries.Compute.Matrix: The matrix of values
- Libraries.Compute.MatrixTransform.OrthonormalTriangularDecomposition: An already calculated decomposition
Return
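One common way to obtain this matrix from an orthonormal triangular (QR) decomposition is sketched below. It assumes this class follows the Apache Commons model, which returns the unscaled matrix, that is, before multiplying by the error variance. R is the upper-triangular factor of the decomposition of the predictor matrix X.

```latex
X = Q R
\quad\Longrightarrow\quad
(X^{\mathsf{T}} X)^{-1} = (R^{\mathsf{T}} R)^{-1} = R^{-1} \left( R^{-1} \right)^{\mathsf{T}}
```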
Compare(Libraries.Language.Object object)
This action compares two object hash codes and returns an integer. The result indicates whether this object's hash code is smaller than, equal to, or larger than that of the object passed as a parameter: -1 means smaller, 0 means equal, and 1 means larger. This action was changed in Quorum 7 to return an integer, instead of a CompareResult object, because the previous implementation was causing efficiency issues.
Example Code
use Libraries.Language.Object
Object o
Object t
integer result = o:Compare(t) //1 (larger), 0 (equal), or -1 (smaller)
Parameters
- Libraries.Language.Object: The object to compare to.
Return
integer: The Compare result, Smaller, Equal, or Larger.
EmptyColumns()
This action empties the list, clearing out all of the items contained within it.
EmptyFactors()
This action empties the list, clearing out all of the items contained within it.
Equals(Libraries.Language.Object object)
This action determines if two objects are equal based on their hash code values.
Example Code
use Libraries.Language.Object
use Libraries.Language.Types.Text
Object o
Text t
boolean result = o:Equals(t)
Parameters
- Libraries.Language.Object: The object to be compared.
Return
boolean: True if the hash codes are equal and false if they are not equal.
GetAdjustedEffectSize()
Returns the total adjusted effect size, in statistics typically termed adjusted R^2 (R-squared). This action returns 0 unless the regression has been calculated.
Return
number: The adjusted R^2 if the regression is calculated
GetCoefficientProbabilityValues()
Returns the probability values for the beta coefficients.
Return
GetCoefficients()
Returns the total beta coefficients. This action returns 0 unless the regression has been calculated.
Return
Libraries.Compute.Vector: The beta coefficients
GetColumn(integer index)
This action gets the item at a given location in an array.
Parameters
Return
integer: The item at the given location.
GetColumnIterator()
This action gets an iterator for the object and returns that iterator.
Return
Libraries.Containers.Iterator: Returns the iterator for an object.
GetColumnSize()
This action gets the size of the array.
Return
integer:
GetCriticalValue()
Returns the critical value
Return
number:
GetEffectSize()
Returns the total effect size, in statistics typically termed R^2 (R-squared). This action returns 0 unless the regression has been calculated.
Return
number: The R^2 if the regression is calculated
GetFactor(integer index)
This action gets the item at a given location in an array.
Parameters
Return
integer: The item at the given location.
GetFactorIterator()
This action gets an iterator for the object and returns that iterator.
Return
Libraries.Containers.Iterator: Returns the iterator for an object.
GetFactorSize()
This action gets the size of the array.
Return
integer:
GetFormalSummary()
This action summarizes the result and places it into formal academic language, in APA format.
Return
text:
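A minimal sketch of requesting the summary after a regression has been calculated follows. It assumes a file at Data/Data.csv whose first column is the dependent variable and whose second column is a predictor. Both the file and the column positions are hypothetical.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.Regression
DataFrame frame
//Assumption: this file exists, column 0 is the dependent variable and column 1 is a predictor
frame:Load("Data/Data.csv")
Regression regression
regression:SetDependentVariable(0)
regression:AddFactor(1)
frame:Calculate(regression)
//Output the result written in formal academic language
output regression:GetFormalSummary()
```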
GetHashCode()
This action gets the hash code for an object.
Example Code
use Libraries.Language.Object
Object o
integer hash = o:GetHashCode()
Return
integer: The integer hash code of the object.
GetProbabilityValue()
Returns the probability value
Return
number:
GetResidualSumOfSquares()
Returns the total residual sum of squares. This action returns 0 unless the regression has been calculated.
Return
number: The residual sum of squares
GetResiduals()
Returns the residuals. This action returns 0 unless the regression has been calculated.
Return
Libraries.Compute.Vector: The residuals
GetTotalSumOfSquares()
Returns the total sum of squares. This action returns 0 unless the regression has been calculated.
Return
number: The total sum of squares
HasIntercept()
Returns whether or not this regression includes an intercept.
Return
boolean:
IsEmptyColumns()
This action returns a boolean value, true if the container is empty and false if it contains any items.
Return
boolean: Returns true when the container is empty and false when it is not.
IsEmptyFactors()
This action returns a boolean value, true if the container is empty and false if it contains any items.
Return
boolean: Returns true when the container is empty and false when it is not.
RemoveColumn(integer column)
This action removes the first occurrence of an item that is found in the Addable object.
Parameters
Return
boolean: Returns true if the item was removed and false if it was not removed.
RemoveColumnAt(integer index)
This action removes an item from an indexed object and returns that item.
Parameters
RemoveFactor(integer column)
This action removes the first occurrence of an item that is found in the Addable object.
Parameters
Return
boolean: Returns true if the item was removed and false if it was not removed.
RemoveFactorAt(integer index)
This action removes an item from an indexed object and returns that item.
Parameters
SetHasIntercept(boolean hasIntercept)
Sets whether or not this regression includes an intercept.
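A minimal sketch of fitting a regression without an intercept is shown below. As with the other examples that load a file, Data/Data.csv and the column positions are assumptions made for illustration.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.Regression
DataFrame frame
//Assumption: this file exists, column 0 is the dependent variable and column 1 is a predictor
frame:Load("Data/Data.csv")
Regression regression
//Turn off the intercept, which is included by default
regression:SetHasIntercept(false)
regression:SetDependentVariable(0)
regression:AddFactor(1)
frame:Calculate(regression)
output "R^2: " + regression:GetEffectSize()
output "Adjusted R^2: " + regression:GetAdjustedEffectSize()
```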