## Statistical Tests Overview

Conducting tests helps scientists learn more about the data they are given. In statistics and data science, statistical tests provide a mechanism for making quantitative decisions about processes. A common example is a test that gives us specific metrics for deciding whether to accept or reject a hypothesis.

Like other statistical computing tools such as R, Python, and Matlab, Quorum can conduct statistical tests. In these tutorials, we describe the tests at a high level and provide links to the technical names sometimes used in various libraries or documentation. Not everyone who uses or consumes data needs statistical testing, and it is generally considered an advanced topic.

As one final point for this first tutorial, consider that every major programming language uses its own naming conventions and organization, if any, for the various statistical tests. Textbooks generally use technical names that do not match the programming language directly (e.g., sometimes abbreviated, sometimes changed). For this reason, we provide a loose list of technical textbook names and the Quorum tutorial each one corresponds to:

Class in Quorum | Formal Test(s) | Purpose |
---|---|---|
CheckReducibility | Bartlett's Test of Sphericity | Tests used to determine whether samples significantly correlate or relate with each other. |
CheckReducibilityStrength | Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO) Test | Tests used to measure how suited data is for factor analysis (finding variables that correlate highly together but do not mix and mingle with other variables outside that group). |
CompareCounts | Chi-square Goodness of Fit Test, Chi-square Test of Independence | Tests used to determine whether observed counts differ from expected counts (goodness of fit) or whether two categorical variables are related (independence). |
CompareDistributions | Shapiro-Wilk Test | Tests used to determine whether samples are distributed normally or not. |
CompareMeans | One-Sample T-Test, Paired T-Test, Wilcoxon Signed-Ranks Test, Two-Sample T-Test, Mann-Whitney U-Test, ANOVA, Kruskal-Wallis Test, Repeated Measures ANOVA, Friedman Test | Tests used to compare whether data sets are different and in what way. |
CompareVariances | Levene's Homogeneity of Variance Test, Brown-Forsythe Homogeneity of Variance Test, Mauchly's Sphericity Test | Tests used to compare different kinds of properties, like the amount of variance (spread of data). |
CompareMeansPairwise | Bonferroni Procedure, Tukey's HSD Multiple Comparison Test, Tukey-Kramer Multiple Comparison Test, Games-Howell Multiple Comparison Test | Tests used to check which group or groups differ after running a CompareGroups test first. |
CorrelateGroups | Pearson Correlation Coefficient, Spearman Correlation Coefficient | Tests used to compare whether variables have any significant relationship to each other. |
PrincipalComponentAnalysis | Principal Component Analysis | Evaluates a set of variables and reduces them to a smaller set. It is often useful for figuring out which variables matter in a data set. |
Regression | OLS Linear Regression | Used to determine the relationship between one dependent variable and a group of independent variables. |

The following tutorials include examples of how to run the statistical tests in Quorum. The data used in these examples is stored in comma-separated values (.csv) files. Many tests use the random.csv file, which does not necessarily contain meaningful data but is simple to understand. Some tests use the Height of Male and Female by Country 2022.csv data, which can be downloaded as well.

## Next Tutorial

In the next tutorial, we will discuss Compare Means, which describes how to compare the difference between groups.