Reduce Large Sets of Data For Analysis

Technical Name: Principal Component Analysis

Principal Component Analysis (PCA) is a statistical procedure that condenses the information in a large dataset (i.e. one with a large number of factors or variables) into a smaller set of components that are easier to analyze or visualize. In Quorum, it is provided by the PrincipalComponentAnalysis class.

One example would be studying what contributes to cancer in general. Many factors may play a role, such as genetics, diet, exposure, and location, and together they can be hard to interpret. Using Principal Component Analysis, we can summarize these factors into a smaller number of components that explain most of the variance across the factors and show which factors carry the most information.

PCA looks at all of the factors together and finds the combinations of them along which the data varies the most. It then combines the original factors into a few new ones, called principal components, that capture most of the important information.
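The idea above can be sketched in a few lines. The block below is a minimal illustration in Python, not Quorum code and not how the PrincipalComponentAnalysis class is implemented. It builds a covariance matrix and uses power iteration to find the first principal component. The function names and the data are hypothetical, and power iteration assumes the starting vector is not perpendicular to the first component.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length observations."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(cov, iterations=500, tolerance=1e-12):
    """Power iteration: largest eigenvalue of cov and its unit eigenvector."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p  # assumed starting direction
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # the start vector carried no variance
        w = [x / norm for x in w]
        change = max(abs(a - b) for a, b in zip(w, v))
        v, eigenvalue = w, norm
        if change < tolerance:
            break
    return eigenvalue, v

# Hypothetical data: three variables measured on five observations.
rows = [(2.0, 4.1, 1.0), (3.0, 6.2, 0.5), (4.0, 7.9, 1.5),
        (5.0, 10.1, 0.8), (6.0, 12.0, 1.2)]
cov = covariance_matrix(rows)
eigenvalue, direction = first_component(cov)

# Share of the total variance captured by the first component.
share = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
print("direction:", direction)
print("share of variance:", share)
```

The entries of `direction` are the weights that combine the original variables into the first component, and `share` says how much of the total variance that single component keeps.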


Components (factors) have a habit of loading on many variables at once, so to help us interpret them, we can "rotate" the results. A rotation pushes the loading of each variable on a given factor toward either a high value or a value near zero, so each factor is tied to a clearer set of variables.

Rotations for PCA
Rotations | Purpose | Function Call
Uncorrelated Component Rotation | Orthogonal Varimax Rotation - keeps factors unrelated | UseUncorrelatedRotation()
Correlated Component Rotation | Oblique Direct Quartimin Rotation - allows factors to be correlated | UseCorrelatedRotation()
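To see what an orthogonal rotation does to a set of loadings, here is a small Python sketch for the two-component case. It is an illustration only, not Quorum code. The loadings are made up, the angle is chosen by hand rather than by a varimax search, and the helper names are hypothetical.

```python
import math

def rotate_loadings(loadings, angle):
    """Rigidly rotate two-component loadings by `angle` radians."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [[x * cos_a + y * sin_a, -x * sin_a + y * cos_a]
            for x, y in loadings]

def communalities(loadings):
    """Variance of each variable explained by the components (row sums of squares)."""
    return [sum(value * value for value in row) for row in loadings]

# Hypothetical loadings: every variable loads on both components.
unrotated = [[0.6, 0.6], [0.5, 0.5], [-0.6, 0.6], [-0.5, 0.5]]

# A 45-degree rotation, picked by hand for this example.
rotated = rotate_loadings(unrotated, math.pi / 4)

for before, after in zip(unrotated, rotated):
    print(before, "->", [round(value, 2) for value in after])
```

After the rotation each variable loads strongly on one component and close to zero on the other, while `communalities` returns the same values before and after. An oblique rotation relaxes the rigid-rotation constraint, which is what allows the rotated components to correlate.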

Customize Rotations

Use the functions below to customize the rotations.

Custom Functions for the PCA Class
Functions | Purpose
Normalize() | Sets whether a Kaiser normalization is applied to the loadings before the rotation. The loadings are returned to their un-normalized scale afterwards.
SetEpsilon() | Sets the convergence threshold for the rotation algorithm. At each iteration, the change in the rotation criterion is compared with this value, and the rotation is treated as converged once the change falls below it.
SetMaximumIterations() | Sets the maximum number of iterations, so a rotation stops even if convergence has not been reached.
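The three settings above map onto a typical iterative rotation loop. The Python sketch below shows one way this can look, using the classical pairwise varimax procedure. It is an assumption-laden illustration rather than the Quorum implementation: the names `varimax`, `normalize`, `epsilon`, and `maximum_iterations` are hypothetical stand-ins for the Quorum settings, and Quorum's own algorithm and default values may differ.

```python
import math

def varimax_criterion(loadings):
    """Raw varimax criterion: summed variance of squared loadings per column."""
    p, k = len(loadings), len(loadings[0])
    total = 0.0
    for j in range(k):
        squares = [row[j] ** 2 for row in loadings]
        mean = sum(squares) / p
        total += sum(s * s for s in squares) / p - mean * mean
    return total

def varimax(loadings, normalize=False, epsilon=1e-6, maximum_iterations=100):
    """Pairwise varimax; returns (rotated loadings, iterations used, converged)."""
    p, k = len(loadings), len(loadings[0])
    work = [list(row) for row in loadings]
    scales = [1.0] * p
    if normalize:
        # Kaiser normalization: scale each variable (row) to unit length.
        scales = [math.sqrt(sum(x * x for x in row)) or 1.0 for row in work]
        work = [[x / s for x in row] for row, s in zip(work, scales)]
    previous = varimax_criterion(work)
    iterations, converged = 0, False
    while iterations < maximum_iterations and not converged:
        iterations += 1
        for a in range(k - 1):
            for b in range(a + 1, k):
                u = [row[a] ** 2 - row[b] ** 2 for row in work]
                v = [2.0 * row[a] * row[b] for row in work]
                sum_u, sum_v = sum(u), sum(v)
                c = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
                d = 2.0 * sum(ui * vi for ui, vi in zip(u, v))
                # Angle that maximizes the criterion for this pair of columns.
                angle = math.atan2(d - 2.0 * sum_u * sum_v / p,
                                   c - (sum_u ** 2 - sum_v ** 2) / p) / 4.0
                cos_a, sin_a = math.cos(angle), math.sin(angle)
                for row in work:
                    x, y = row[a], row[b]
                    row[a] = x * cos_a + y * sin_a
                    row[b] = -x * sin_a + y * cos_a
        current = varimax_criterion(work)
        # Stop once the criterion changes by less than epsilon.
        converged = abs(current - previous) < epsilon
        previous = current
    # Undo the Kaiser normalization so loadings return to their original scale.
    rotated = [[x * s for x in row] for row, s in zip(work, scales)]
    return rotated, iterations, converged

# Hypothetical loadings for four variables on two components.
loadings = [[0.8, 0.3], [0.7, 0.4], [0.2, 0.9], [0.3, 0.6]]
rotated, used, converged = varimax(loadings, normalize=True)
print(rotated, used, converged)
```

A smaller `epsilon` asks for a tighter fit at the cost of more passes, while `maximum_iterations` caps the work when the criterion keeps changing. When the cap is hit first, `converged` comes back as False.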

An example of the PCA test implemented in Quorum is below.

Example of a Principal Component Analysis

Next Tutorial

In the next tutorial, we will discuss Regression, which describes how to predict an outcome.