Libraries.Compute.Statistics.Tests.Regression Documentation

This class conducts an Ordinary Least Squares regression on a DataFrame. By default, an intercept is calculated and included in the model. More information about this kind of statistical test can be found here: https://en.wikipedia.org/wiki/Ordinary_least_squares. It was adapted from the same model in Apache Commons, but was expanded upon to simplify the library and add a variety of helper actions that were missing. More information about the original Apache Commons class can be found on its documentation page: https://commons.apache.org/proper/commons-math/javadocs/api-3.6/org/apache/commons/math3/stat/regression/OLSMultipleLinearRegression.html

Example Code

    use Libraries.Compute.Statistics.DataFrame
    use Libraries.Compute.Statistics.Columns.NumberColumn
    use Libraries.Containers.Array
    use Libraries.Compute.Statistics.DataFrameColumn
    use Libraries.Compute.Statistics.Tests.Regression

    DataFrame frame
    NumberColumn column0
    column0:SetHeader("y")
    column0:Add("1")
    column0:Add("2")
    column0:Add("3")
    column0:Add("4")
    column0:Add("5")
    column0:Add("6")

    NumberColumn column2
    column2:SetHeader("2")
    column2:Add("12.0")
    column2:Add("6")
    column2:Add("-4")
    column2:Add("1")
    column2:Add("97")
    column2:Add("65")

    NumberColumn column3
    column3:SetHeader("3")
    column3:Add("-51.0")
    column3:Add("167")
    column3:Add("24")
    column3:Add("2")
    column3:Add("120")
    column3:Add("69")

    NumberColumn column4
    column4:SetHeader("4")
    column4:Add("4")
    column4:Add("-68")
    column4:Add("-41")
    column4:Add("3")
    column4:Add("159")
    column4:Add("73")

    Array<DataFrameColumn> columns
    columns:Add(column0)
    columns:Add(column2)
    columns:Add(column3)
    columns:Add(column4)

    frame:SetColumns(columns)
    Regression regression
    regression:SetDependentVariable(0)
    regression:AddFactor(1)
    regression:AddFactor(2)
    regression:AddFactor(3)
    frame:Calculate(regression)

    //Output a series of attributes about the regression
    output "Beta: " + regression:GetCoefficients():ToText()
    output "Beta p-values: " + regression:GetCoefficientProbabilityValues():ToText()
    output "Residuals: " + regression:GetResiduals():ToText()
    output "Residual Sum of Squares: " + regression:GetResidualSumOfSquares()
    output "Total Sum of Squares: " + regression:GetTotalSumOfSquares()
    output "F = " + regression:GetCriticalValue()
    output "p = " + regression:GetProbabilityValue()
    output "R^2: " + regression:GetEffectSize()

Inherits from: Libraries.Compute.Statistics.DataFrameCalculation, Libraries.Compute.Statistics.Inputs.ColumnInput, Libraries.Language.Object, Libraries.Compute.Statistics.Inputs.FactorInput

Summary

Actions Summary Table

Action: Description
AddColumn(integer column): This action adds a value to the end of the input.
AddFactor(integer column): This action adds a value to the end of the input.
Calculate(Libraries.Compute.Statistics.DataFrame frame)
CalculateAdjustedEffectSize(Libraries.Compute.Matrix predictors, number r2, boolean intercept): This action calculates an adjusted effect size.
CalculateCriticalValue(Libraries.Compute.Matrix predictors, number r2): This action returns the critical value for the matrix and the given effect size (R^2).
CalculateDenominatorDegreesOfFreedom(Libraries.Compute.Matrix predictors): This calculates the degrees of freedom of the denominator in the F-ratio.
CalculateEffectSize(number residualSumOfSquares, number totalSumOfSquares): This action returns the effect size for the calculation.
CalculateErrorVariance(Libraries.Compute.Matrix predictors, Libraries.Compute.Vector residuals): This action calculates the total error variance from the residuals.
CalculateNumeratorDegreesOfFreedom(Libraries.Compute.Matrix predictors): This calculates the degrees of freedom of the numerator in the F-ratio.
CalculatePValues(Libraries.Compute.Matrix predictors, Libraries.Compute.MatrixTransform.OrthonormalTriangularDecomposition ortho, Libraries.Compute.Vector residuals, Libraries.Compute.Vector betas): This action calculates the probability values for each beta-coefficient in the model.
CalculateProbabilityValue(Libraries.Compute.Matrix predictors, number criticalValue): This action returns the probability value (p-value) for the overall regression.
CalculateRegressionParametersStandardErrors(Libraries.Compute.Matrix predictors, Libraries.Compute.MatrixTransform.OrthonormalTriangularDecomposition ortho, Libraries.Compute.Vector residuals): This action calculates the standard errors from the residuals.
CalculateResidualSumOfSquares(Libraries.Compute.Vector residuals): This action calculates the sum of squares of the residuals.
CalculateResiduals(Libraries.Compute.Vector y, Libraries.Compute.Matrix predictors, Libraries.Compute.Vector b): This action calculates the residuals.
CalculateTotalSumOfSquares(Libraries.Compute.Statistics.DataFrameColumn column, boolean intercept): This action calculates the sum of squares for an instance of this regression.
CalculateVarianceCovarianceMatrix(Libraries.Compute.Matrix predictors, Libraries.Compute.MatrixTransform.OrthonormalTriangularDecomposition ortho): This action calculates the variance-covariance matrix.
Compare(Libraries.Language.Object object): This action compares two object hash codes and returns an integer.
EmptyColumns(): This action empties the list, clearing out all of the items contained within it.
EmptyFactors(): This action empties the list, clearing out all of the items contained within it.
Equals(Libraries.Language.Object object): This action determines if two objects are equal based on their hash code values.
GetAdjustedEffectSize(): Returns the total adjusted effect size, in statistics typically termed adjusted R^2 (R-squared).
GetCoefficientProbabilityValues(): Returns the probability values for the beta coefficients.
GetCoefficients(): Returns the total beta coefficients.
GetColumn(integer index): This action gets the item at a given location in an array.
GetColumnIterator(): This action gets an iterator for the object and returns that iterator.
GetColumnSize(): This action gets the size of the array.
GetCriticalValue(): Returns the critical value.
GetEffectSize(): Returns the total effect size, in statistics typically termed R^2 (R-squared).
GetFactor(integer index): This action gets the item at a given location in an array.
GetFactorIterator(): This action gets an iterator for the object and returns that iterator.
GetFactorSize(): This action gets the size of the array.
GetFormalSummary(): This action summarizes the result and places it into formal academic language, in APA format.
GetHashCode(): This action gets the hash code for an object.
GetProbabilityValue(): Returns the probability value.
GetResidualSumOfSquares(): Returns the total residual sum of squares.
GetResiduals(): Returns the residuals.
GetTotalSumOfSquares(): Returns the total sum of squares.
HasIntercept(): Returns whether or not this regression includes an intercept.
IsEmptyColumns(): This action returns a boolean value, true if the container is empty and false if it contains any items.
IsEmptyFactors(): This action returns a boolean value, true if the container is empty and false if it contains any items.
RemoveColumn(integer column): This action removes the first occurrence of an item that is found in the Addable object.
RemoveColumnAt(integer index): This action removes an item from an indexed object and returns that item.
RemoveFactor(integer column): This action removes the first occurrence of an item that is found in the Addable object.
RemoveFactorAt(integer index): This action removes an item from an indexed object and returns that item.
SetHasIntercept(boolean hasIntercept): Sets whether or not this regression includes an intercept.

Actions Documentation

AddColumn(integer column)

This action adds a value to the end of the input.

Parameters

AddFactor(integer column)

This action adds a value to the end of the input.

Parameters

Calculate(Libraries.Compute.Statistics.DataFrame frame)

Parameters

CalculateAdjustedEffectSize(Libraries.Compute.Matrix predictors, number r2, boolean intercept)

This action calculates an adjusted effect size. This adjustment accounts for the number of predictors included in the model.
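The adjustment itself is not written out here. As a sketch, assuming this action follows the Apache Commons class it was adapted from, with n the number of rows and p the number of columns in the predictor matrix, the adjusted effect size would be:

```latex
\[
\bar{R}^2 =
\begin{cases}
1 - (1 - R^2)\,\dfrac{n - 1}{n - p} & \text{with an intercept} \\[8pt]
1 - (1 - R^2)\,\dfrac{n}{n - p} & \text{without an intercept}
\end{cases}
\]
```

Under this assumption, adding a predictor raises p and shrinks n - p, so the adjusted value only improves when the gain in R^2 outweighs the lost degree of freedom.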

Parameters

Return

number: The adjusted R^2.

CalculateCriticalValue(Libraries.Compute.Matrix predictors, number r2)

This action returns the critical value for the matrix and the given effect size (R^2). In statistics, this value is typically termed an "F-value," a way of describing where a result falls on a distribution. We calculate this value with the following equation: F = (R^2 / (p - 1)) / ((1 - R^2) / (n - p)), where n is the number of rows in the predictor matrix and p is the number of columns. While an example is included on how to calculate this value, it is complicated and we highly recommend calculating the regression and just calling GetCriticalValue instead.

Example Code

        use Libraries.Compute.Matrix
        use Libraries.Compute.Vector
        use Libraries.Compute.MatrixTransform.OrthonormalTriangularDecomposition
        use Libraries.Compute.Statistics.DataFrame
        use Libraries.Compute.Statistics.DataFrameColumn
        use Libraries.Compute.Statistics.Tests.Regression

        DataFrame frame
        frame:Load("Data/Data.csv")
        Regression regression

        //Assumed for illustration: the dependent variable is the first column
        //and the model includes an intercept.
        integer predictedColumn = 0
        boolean hasIntercept = true

        DataFrameColumn column = frame:GetColumn(predictedColumn)
        //"transformed" is assumed to be a DataFrame, prepared beforehand,
        //that holds the predictor columns of the model.
        Matrix predictorMatrix = transformed:ConvertToMatrix()
        OrthonormalTriangularDecomposition decomp
        decomp:Calculate(predictorMatrix)
        Vector predicted = undefined

        //Stop if the dependent variable cannot be treated as numbers
        if column:CanConvertToVector()
            predicted = column:ConvertToVector()
        else
            return now
        end

        //Solve for the coefficients, then build up to R^2 and F
        Vector beta = decomp:Solve(predicted)
        Vector residuals = regression:CalculateResiduals(predicted, predictorMatrix, beta)
        number residualSumOfSquares = regression:CalculateResidualSumOfSquares(residuals)
        number totalSumOfSquares = regression:CalculateTotalSumOfSquares(column, hasIntercept)

        number r2 = regression:CalculateEffectSize(residualSumOfSquares, totalSumOfSquares)
        number fValue = regression:CalculateCriticalValue(predictorMatrix, r2)
        output fValue

Parameters

Return

number: This returns the critical value, typically called "F" in statistics.
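To make the F-ratio for this action concrete, here is a worked calculation with hypothetical numbers chosen only for illustration: R^2 = 0.8, a predictor matrix with n = 6 rows and p = 4 columns.

```latex
\[
F = \frac{R^2 / (p - 1)}{(1 - R^2) / (n - p)}
  = \frac{0.8 / 3}{0.2 / 2}
  \approx \frac{0.2667}{0.1}
  \approx 2.67
\]
```

The numerator divides by p - 1 and the denominator by n - p, the same quantities returned by CalculateNumeratorDegreesOfFreedom and CalculateDenominatorDegreesOfFreedom.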

CalculateDenominatorDegreesOfFreedom(Libraries.Compute.Matrix predictors)

This calculates the degrees of freedom of the denominator in the F-ratio. It is equivalent to the number of rows in the matrix minus the number of columns.

Parameters

Return

number: The number of rows in the matrix minus the number of columns.

CalculateEffectSize(number residualSumOfSquares, number totalSumOfSquares)

This action returns the effect size for the calculation. The technical name for this effect is "R^2" and it is calculated as 1 minus the residual sum of squares divided by the total sum of squares.
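A minimal sketch of calling this action directly, using only the signature documented on this page. The two sums of squares are hypothetical values chosen for illustration, not results from real data.

```quorum
use Libraries.Compute.Statistics.Tests.Regression

Regression regression

//Hypothetical inputs, chosen only to make the arithmetic easy to follow
number residualSumOfSquares = 20.0
number totalSumOfSquares = 100.0

//R^2 = 1 - (residual sum of squares / total sum of squares)
number r2 = regression:CalculateEffectSize(residualSumOfSquares, totalSumOfSquares)
output r2
```

With these inputs the documented formula gives 1 - 20 / 100 = 0.8. In normal use, calling GetEffectSize after the regression has been calculated avoids computing the sums of squares by hand.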

Parameters

Return

number: The effect size (R^2).

CalculateErrorVariance(Libraries.Compute.Matrix predictors, Libraries.Compute.Vector residuals)

This action calculates the total error variance from the residuals.
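The formula is not given here. As a sketch, assuming the action matches the Apache Commons class it was adapted from, the error variance is the residual sum of squares divided by the residual degrees of freedom, where e is the residual vector, n the number of rows, and p the number of columns in the predictor matrix:

```latex
\[
\hat{\sigma}^2 = \frac{e^{\top} e}{n - p} = \frac{\sum_{i=1}^{n} e_i^2}{n - p}
\]
```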

Parameters

Return

number: The error variance.

CalculateNumeratorDegreesOfFreedom(Libraries.Compute.Matrix predictors)

This calculates the degrees of freedom of the numerator in the F-ratio. It is equivalent to the number of columns in the matrix minus 1.
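As a quick check of the two degrees-of-freedom actions, take a hypothetical predictor matrix with n = 6 rows and p = 4 columns (values chosen only for illustration):

```latex
\[
df_{\text{numerator}} = p - 1 = 4 - 1 = 3,
\qquad
df_{\text{denominator}} = n - p = 6 - 4 = 2
\]
```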

Parameters

Return

number: The number of columns in the matrix minus 1.

CalculatePValues(Libraries.Compute.Matrix predictors, Libraries.Compute.MatrixTransform.OrthonormalTriangularDecomposition ortho, Libraries.Compute.Vector residuals, Libraries.Compute.Vector betas)

This action calculates the probability values for each beta-coefficient in the model. This is a complex test and should not be used externally unless you really know what you are doing.
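The test used for each coefficient is not spelled out here. A conventional approach for ordinary least squares, offered as an assumption rather than a confirmed description of this action, is a two-sided t-test on each coefficient b_j with n - p degrees of freedom:

```latex
\[
t_j = \frac{b_j}{\mathrm{SE}(b_j)},
\qquad
p_j = 2\left(1 - T_{n-p}\!\left(|t_j|\right)\right)
\]
```

Here SE(b_j) is the standard error of the j-th coefficient and T_{n-p} is the cumulative distribution function of Student's t-distribution with n - p degrees of freedom.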

Parameters

Return

Libraries.Compute.Vector: a vector of the probability values

CalculateProbabilityValue(Libraries.Compute.Matrix predictors, number criticalValue)

This action returns the probability value (p-value) for the overall regression. This is sometimes called an "omnibus p-value."
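As a sketch of the usual definition, assumed here rather than confirmed for this action, the omnibus p-value is the upper-tail probability of the critical value F under an F-distribution whose degrees of freedom are p - 1 and n - p (n rows and p columns in the predictor matrix):

```latex
\[
p\text{-value} = P\!\left(F_{p-1,\,n-p} \ge F\right)
             = 1 - \mathrm{CDF}_{F(p-1,\,n-p)}(F)
\]
```

Note that p in the subscripts counts matrix columns and is unrelated to the p-value on the left.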

Example Code

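        use Libraries.Compute.Matrix
        use Libraries.Compute.Statistics.DataFrame
        use Libraries.Compute.Statistics.DataFrameColumn
        use Libraries.Compute.Statistics.Tests.Regression

        DataFrame frame
        frame:Load("Data/Data.csv")
        //"transformed" is assumed to be a DataFrame, prepared beforehand,
        //that holds the predictor columns of the model.
        Matrix predictorMatrix = transformed:ConvertToMatrix()
        Regression regression
        //0.9 is the critical value (F) to evaluate
        number p = regression:CalculateProbabilityValue(predictorMatrix, 0.9)
        output p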

Parameters

Return

number: The probability value.

CalculateRegressionParametersStandardErrors(Libraries.Compute.Matrix predictors, Libraries.Compute.MatrixTransform.OrthonormalTriangularDecomposition ortho, Libraries.Compute.Vector residuals)

This action calculates the standard errors from the residuals.
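The standard textbook form, which is also what the Apache Commons original computes, is sketched below. Whether this action follows it exactly is an assumption. X is the predictor matrix and the error variance is the residual sum of squares divided by n - p:

```latex
\[
\mathrm{SE}(b_j) = \sqrt{\hat{\sigma}^2 \left[(X^{\top} X)^{-1}\right]_{jj}}
\]
```

Each standard error is the square root of a diagonal entry of the scaled variance-covariance matrix, one entry per coefficient.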

Parameters

Return

Libraries.Compute.Vector: The standard errors.

CalculateResidualSumOfSquares(Libraries.Compute.Vector residuals)

This action calculates the sum of squares of the residuals.

Parameters

Return

number: returns the sum of squares from the residuals

CalculateResiduals(Libraries.Compute.Vector y, Libraries.Compute.Matrix predictors, Libraries.Compute.Vector b)

This action calculates the residuals.
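In standard notation, with y the observed values, X the predictor matrix, and b the coefficient vector (the three parameters of this action), the residuals are the differences between observed and fitted values:

```latex
\[
e = y - Xb,
\qquad
e_i = y_i - \sum_{j=1}^{p} x_{ij}\, b_j
\]
\[
\mathrm{RSS} = e^{\top} e = \sum_{i=1}^{n} e_i^2
\]
```

The second line is the quantity that CalculateResidualSumOfSquares derives from this vector.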

Parameters

Return

Libraries.Compute.Vector: returns the residuals

CalculateTotalSumOfSquares(Libraries.Compute.Statistics.DataFrameColumn column, boolean intercept)

This action calculates the sum of squares for an instance of this regression. It uses the sum of squares if the model has no intercept and the second moment if it does.
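The two cases can be written out as follows. This sketch assumes "second moment" carries the meaning it has in Apache Commons, namely the sum of squared deviations from the sample mean:

```latex
\[
\mathrm{TSS} =
\begin{cases}
\displaystyle\sum_{i=1}^{n} (y_i - \bar{y})^2 & \text{with an intercept} \\[10pt]
\displaystyle\sum_{i=1}^{n} y_i^2 & \text{without an intercept}
\end{cases}
\]
```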

Example Code

        use Libraries.Compute.Statistics.DataFrame
        use Libraries.Compute.Statistics.DataFrameColumn
        use Libraries.Compute.Statistics.Tests.Regression
    
        DataFrame frame
        frame:Load("Data/Data.csv")

        DataFrameColumn column = frame:GetColumn("DT")

        Regression regression
        number value = regression:CalculateTotalSumOfSquares(column, false)
        output value

Parameters

Return

number: the total sum of squares

CalculateVarianceCovarianceMatrix(Libraries.Compute.Matrix predictors, Libraries.Compute.MatrixTransform.OrthonormalTriangularDecomposition ortho)

This action calculates the variance-covariance matrix.
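The decomposition parameter suggests the matrix is obtained from a QR factorization rather than by inverting the cross-product directly. As a sketch under that assumption, with X = QR where Q has orthonormal columns and R is upper triangular:

```latex
\[
(X^{\top} X)^{-1} = (R^{\top} Q^{\top} Q R)^{-1} = (R^{\top} R)^{-1} = R^{-1} R^{-\top}
\]
\[
\mathrm{Var}(b) = \hat{\sigma}^2\, (X^{\top} X)^{-1}
\]
```

In the Apache Commons original, the corresponding method returns the unscaled first quantity and leaves the multiplication by the error variance to the caller. Whether this action does the same is not stated here.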

Parameters

Return

Libraries.Compute.Matrix: The variance-covariance matrix.

Compare(Libraries.Language.Object object)

This action compares two object hash codes and returns an integer. The result indicates whether this hash code is larger than, smaller than, or equal to that of the object passed as a parameter. In this case, -1 means smaller, 0 means equal, and 1 means larger. This action was changed in Quorum 7 to return an integer, instead of a CompareResult object, because the previous implementation was causing efficiency issues.

Example Code

        Object o
        Object t
        integer result = o:Compare(t) //1 (larger), 0 (equal), or -1 (smaller)

Parameters

Return

integer: The comparison result: -1 (smaller), 0 (equal), or 1 (larger).

EmptyColumns()

This action empties the list, clearing out all of the items contained within it.

EmptyFactors()

This action empties the list, clearing out all of the items contained within it.

Equals(Libraries.Language.Object object)

This action determines if two objects are equal based on their hash code values.

Example Code

        use Libraries.Language.Object
        use Libraries.Language.Types.Text
        Object o
        Text t
        boolean result = o:Equals(t)

Parameters

Return

boolean: True if the hash codes are equal and false if they are not equal.

GetAdjustedEffectSize()

Returns the total adjusted effect size, in statistics typically termed adjusted R^2 (R-squared). This action returns 0 unless the regression has been calculated.

Return

number: The adjusted R^2 if the regression is calculated

GetCoefficientProbabilityValues()

Returns the probability values for the beta coefficients.

Return

Libraries.Compute.Vector: The probability values for the beta coefficients.

GetCoefficients()

Returns the total beta coefficients. This action returns 0 unless the regression has been calculated.

Return

Libraries.Compute.Vector: The beta coefficients

GetColumn(integer index)

This action gets the item at a given location in an array.

Parameters

Return

integer: The item at the given location.

GetColumnIterator()

This action gets an iterator for the object and returns that iterator.

Return

Libraries.Containers.Iterator: Returns the iterator for an object.

GetColumnSize()

This action gets the size of the array.

Return

integer: The size of the array.

GetCriticalValue()

Returns the critical value

Return

number: The critical value.

GetEffectSize()

Returns the total effect size, in statistics typically termed R^2 (R-squared). This action returns 0 unless the regression has been calculated.

Return

number: The R^2 if the regression is calculated

GetFactor(integer index)

This action gets the item at a given location in an array.

Parameters

Return

integer: The item at the given location.

GetFactorIterator()

This action gets an iterator for the object and returns that iterator.

Return

Libraries.Containers.Iterator: Returns the iterator for an object.

GetFactorSize()

This action gets the size of the array.

Return

integer: The size of the array.

GetFormalSummary()

This action summarizes the result and places it into formal academic language, in APA format.

Return

text: The summary of the result in APA format.

GetHashCode()

This action gets the hash code for an object.

Example Code

        Object o
        integer hash = o:GetHashCode()

Return

integer: The integer hash code of the object.

GetProbabilityValue()

Returns the probability value

Return

number: The probability value.

GetResidualSumOfSquares()

Returns the total residual sum of squares. This action returns 0 unless the regression has been calculated.

Return

number: The residual sum of squares

GetResiduals()

Returns the residuals. This action returns 0 unless the regression has been calculated.

Return

Libraries.Compute.Vector: The residuals

GetTotalSumOfSquares()

Returns the total sum of squares. This action returns 0 unless the regression has been calculated.

Return

number: The total sum of squares

HasIntercept()

Returns whether or not this regression includes an intercept.

Return

boolean: True if this regression includes an intercept and false if it does not.

IsEmptyColumns()

This action returns a boolean value, true if the container is empty and false if it contains any items.

Return

boolean: Returns true when the container is empty and false when it is not.

IsEmptyFactors()

This action returns a boolean value, true if the container is empty and false if it contains any items.

Return

boolean: Returns true when the container is empty and false when it is not.

RemoveColumn(integer column)

This action removes the first occurrence of an item that is found in the Addable object.

Parameters

Return

boolean: Returns true if the item was removed and false if it was not removed.

RemoveColumnAt(integer index)

This action removes an item from an indexed object and returns that item.

Parameters

RemoveFactor(integer column)

This action removes the first occurrence of an item that is found in the Addable object.
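A minimal sketch of managing the factor list, using only actions documented on this page. The column indices are hypothetical, and the example assumes, as the signatures suggest, that factors are stored as integer column indices.

```quorum
use Libraries.Compute.Statistics.Tests.Regression

Regression regression

//Register three hypothetical predictor columns by index
regression:AddFactor(1)
regression:AddFactor(2)
regression:AddFactor(3)
output regression:GetFactorSize()

//Remove the factor whose value is 2, not the factor at position 2
boolean removed = regression:RemoveFactor(2)
output removed
output regression:GetFactorSize()

//Clear the remaining factors and confirm the list is empty
regression:EmptyFactors()
output regression:IsEmptyFactors()
```

RemoveFactor searches for a matching value, while RemoveFactorAt removes by position, so the two can give different results when the indices stored in the list are not in order.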

Parameters

Return

boolean: Returns true if the item was removed and false if it was not removed.

RemoveFactorAt(integer index)

This action removes an item from an indexed object and returns that item.

Parameters

SetHasIntercept(boolean hasIntercept)

Sets whether or not this regression includes an intercept.
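A minimal sketch of fitting a model without an intercept, following the pattern of the class example at the top of this page. The file name and column indices are hypothetical.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.Regression

DataFrame frame
frame:Load("Data/Data.csv") //hypothetical data file

Regression regression
//Turn the intercept off before calculating; it is included by default
regression:SetHasIntercept(false)
regression:SetDependentVariable(0) //hypothetical column index
regression:AddFactor(1)            //hypothetical column index
frame:Calculate(regression)

output regression:HasIntercept()
output regression:GetEffectSize()
```

Because the total sum of squares is computed differently with and without an intercept, R^2 values from the two kinds of model are not directly comparable.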

Parameters