## Reduce Large Sets of Data For Analysis

### Technical Name: Principal Component Analysis

Principal Component Analysis (PCA) is a statistical procedure that condenses the information in a large dataset (i.e., one with a large number of factors or variables) into a smaller set of components that are easier to analyze or visualize. In Quorum, it is provided by the PrincipalComponentAnalysis class.

For example, suppose we want to understand which factors are associated with cancer in general. Many factors could be involved, such as genetics, diet, exposure, and location, and studying them all at once is difficult. With Principal Component Analysis, we can summarize these factors in a smaller number of components that account for most of the variance across the factors and highlight the factors that matter most.

PCA does this by looking at how all of the factors vary together and finding the combinations of factors along which the data differ the most. It then combines the factors into a few new ones, called principal components, that retain most of the important information.
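To make the idea concrete, here is a minimal sketch in Python rather than Quorum. It is not Quorum's implementation. It handles only two variables, where the principal components of the covariance matrix have a closed form; the function name and the data are hypothetical and chosen for illustration.

```python
import math

def principal_components_2d(xs, ys):
    """Return (largest variance, smallest variance, direction of first component)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Entries of the 2x2 sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    syy = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix: the variance along each component.
    trace = sxx + syy
    gap = math.sqrt((sxx - syy) ** 2 + 4 * sxy ** 2)
    largest = (trace + gap) / 2
    smallest = (trace - gap) / 2
    # Angle of the axis along which the data vary the most.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(angle), math.sin(angle))
    return largest, smallest, direction

# Hypothetical data: two measurements that rise and fall together.
hours_exercised = [1.0, 2.0, 3.0, 4.0, 5.0]
fitness_score = [2.1, 3.9, 6.2, 7.8, 10.1]

first, second, direction = principal_components_2d(hours_exercised, fitness_score)
explained = first / (first + second)
print(f"First component explains {explained:.1%} of the variance")
print(f"Direction of first component: {direction}")
```

Because the two made-up variables move together, a single component captures nearly all of the variance, which is the sense in which PCA reduces many factors to a few.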

### Rotations

Factors have a habit of loading on many variables at once, which makes them hard to interpret. To help, we can "rotate" the results. A rotation can maximize or minimize the loading of a particular variable on a particular factor.

Rotations | Purpose | Function Call |
---|---|---|
Uncorrelated Component Rotation | Orthogonal Varimax Rotation - keeps factors unrelated | UseUncorrelatedRotation() |
Correlated Component Rotation | Oblique Direct Quartimin Rotation - allows factors to be correlated | UseCorrelatedRotation() |
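As a rough illustration of what an orthogonal rotation does to loadings, the sketch below is written in Python, not Quorum, and does not reproduce Quorum's rotation code. The loadings and the 30 degree angle are made up. A Varimax rotation would search for a suitable angle itself, whereas here the angle is supplied by hand.

```python
import math

def rotate_loadings(loadings, degrees):
    """Apply an orthogonal rotation to two-component loadings."""
    angle = math.radians(degrees)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [(x * cos_a + y * sin_a, -x * sin_a + y * cos_a) for x, y in loadings]

def communalities(loadings):
    """Sum of squared loadings for each variable."""
    return [x * x + y * y for x, y in loadings]

# Hypothetical unrotated loadings: every variable loads on both components.
unrotated = [(0.78, 0.45), (0.69, 0.40), (-0.35, 0.61), (-0.30, 0.52)]
rotated = rotate_loadings(unrotated, 30)

for before, after in zip(unrotated, rotated):
    print(before, "->", tuple(round(value, 2) for value in after))
```

With these made-up numbers, the first two variables end up loading almost entirely on the first component and the last two on the second. Each variable's communality is unchanged, because an orthogonal rotation only redistributes the loadings between components.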

### Customize Rotations

Use the functions below to customize the rotations.

Functions | Purpose |
---|---|
Normalize() | Specifies whether Kaiser normalization is applied to the loadings before the rotation. The loadings are reset to their un-normalized form afterwards. |
SetEpsilon() | Sets the convergence threshold. The change in the rotation criterion between iterations is compared against this value to decide whether the algorithm has converged. |
SetMaximumIterations() | Sets the maximum number of iterations, after which the rotation stops even if convergence has not been reached |
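The sketch below shows how these three settings typically interact in a Varimax rotation. It is written in Python, uses the classic approach of rotating one pair of components at a time, and is an assumption about the general technique, not a copy of Quorum's algorithm. The parameter names `normalize`, `epsilon`, and `maximum_iterations` only mirror the functions above, and the default values are illustrative.

```python
import math

def criterion(loadings):
    """Raw varimax criterion: spread of the squared loadings, summed over components."""
    p = len(loadings)
    total = 0.0
    for j in range(len(loadings[0])):
        squares = [row[j] ** 2 for row in loadings]
        total += sum(s * s for s in squares) - sum(squares) ** 2 / p
    return total

def rotate_pair(rows, i, j):
    """Rotate components i and j in place by the angle that maximizes the criterion."""
    p = len(rows)
    u = [row[i] ** 2 - row[j] ** 2 for row in rows]
    v = [2 * row[i] * row[j] for row in rows]
    a, b = sum(u), sum(v)
    c = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
    d = 2 * sum(ui * vi for ui, vi in zip(u, v))
    angle = 0.25 * math.atan2(d - 2 * a * b / p, c - (a * a - b * b) / p)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    for row in rows:
        x, y = row[i], row[j]
        row[i] = x * cos_a + y * sin_a
        row[j] = -x * sin_a + y * cos_a

def varimax(loadings, normalize=False, epsilon=1e-6, maximum_iterations=100):
    """Return (rotated loadings, number of sweeps performed)."""
    rows = [list(row) for row in loadings]
    lengths = [1.0] * len(rows)
    if normalize:
        # Kaiser normalization: scale each variable's loadings to unit length.
        lengths = [math.sqrt(sum(x * x for x in row)) or 1.0 for row in rows]
        rows = [[x / h for x in row] for row, h in zip(rows, lengths)]
    previous = criterion(rows)
    iterations = 0
    while iterations < maximum_iterations:
        iterations += 1
        for i in range(len(rows[0])):
            for j in range(i + 1, len(rows[0])):
                rotate_pair(rows, i, j)
        current = criterion(rows)
        # Converged once the criterion improves by less than epsilon.
        if current - previous < epsilon:
            break
        previous = current
    # Undo the normalization so the loadings return to their original scale.
    rotated = [[x * h for x in row] for row, h in zip(rows, lengths)]
    return rotated, iterations

# Hypothetical loadings for five variables on three components.
sample = [[0.78, 0.45, 0.10], [0.69, 0.40, -0.05], [-0.35, 0.61, 0.20],
          [-0.30, 0.52, 0.15], [0.10, 0.05, 0.70]]
rotated, sweeps = varimax(sample, normalize=True, epsilon=1e-8, maximum_iterations=50)
print(sweeps, [[round(x, 2) for x in row] for row in rotated])
```

A smaller `epsilon` asks for a more precise rotation at the cost of more sweeps, while `maximum_iterations` caps the work when the criterion keeps improving by small amounts.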

An example of a Principal Component Analysis implemented in Quorum is below.

Example of a Principal Component Analysis

## Next Tutorial

In the next tutorial, we will discuss Regression, which describes how to predict an outcome.