Libraries.Compute.Statistics.Clustering.ClusterByMeans

Libraries.Compute.Statistics.Clustering.ClusterByMeans Documentation

This class clusters data using an approach similar to the KMeans++ algorithm. The original code was adapted from Apache Commons Math: https://commons.apache.org/proper/commons-math/download_math.cgi.

As a TODO, some optimizations and features are either not carried over from the original or probably should be added. First, optimizations exist for KMeans++ that reduce the initialization effort (http://vldb.org/pvldb/vol5/p622_bahmanbahmani_vldb2012.pdf); these have not been included. Second, this implementation needs to be made more flexible to support competing strategies for empty clusters and alternative distance computations. Currently, only Euclidean distance is included.

This implementation requires that every value in the DataFrame be an integer or a number, or convertible to one, and that no undefined values exist. If either condition is violated, an error will be thrown when the algorithm processes the values. It also requires that the cluster count be greater than 0 and strictly less than the number of rows.
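Because invalid values cause an error while the algorithm runs, a call to Cluster can fail at runtime. The following is a minimal sketch, assuming the error can be caught with Quorum's standard check/detect blocks. The data is hypothetical, with a non-numeric value placed in column Y on purpose, and the exact error message is not specified by this documentation.

```quorum
use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Clustering.ClusterByMeans
use Libraries.Compute.Statistics.Clustering.ClusterResult

//hypothetical data: the value "four" in column Y is not a number
DataFrame frame
frame:LoadFromCommaSeparatedValue(
    "X,Y
    1,2
    2,four
    3,6
    4,8
    5,10"
)
frame:AddSelectedColumnRange(0,1)

ClusterByMeans means

//guard the call, since invalid values are reported only when
//the algorithm processes them
check
    ClusterResult result = means:Cluster(frame)
detect e
    output "Clustering failed: " + e:GetErrorMessage()
end
```

Guarding the call this way lets a program report bad input and continue, rather than stopping at the first invalid value.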

Example Code

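use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Clustering.ClusterByMeans
use Libraries.Compute.Statistics.Clustering.ClusterResult
use Libraries.Compute.Statistics.Clustering.Cluster
use Libraries.Compute.Statistics.Columns.IntegerColumn
use Libraries.Containers.Array
use Libraries.Interface.Controls.Charts.ScatterPlot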

//make a DataFrame and toss some data in it
DataFrame frame
frame:LoadFromCommaSeparatedValue(
    "X,Y
    1,2
    2,4
    3,6
    4,8
    5,10
    9,18
    10,20
    11,22
    12,24
    13,26"
)
//set the range of points on which we will calculate distance
frame:AddSelectedColumnRange(0,1)

output "Calculating K-means Clustering"
ClusterByMeans means
means:SetClustersSize(3)

//Clusters return an additional column with labels, so they can 
//be included in a chart or other approach, per point
//If we want this in the DataFrame, we need to add it manually
ClusterResult result = means:Cluster(frame)
Array<Cluster> value = result:GetClusters()
IntegerColumn assignments = result:GetClusterIndices()
assignments:SetHeader("Clusters")
frame:AddColumn(assignments) 

//we can also chart the clusters if
//we specify that the new clusters are a factor
frame:AddSelectedFactor(2)
ScatterPlot chart = frame:ScatterPlot()
chart:SetTitle("K-Means Clustering Demo")
chart:Display()

Inherits from: Libraries.Language.Object

Actions Documentation

Cluster(Libraries.Compute.Statistics.DataFrame frame)

This action clusters the DataFrame using its selected columns, without taking any factors into account. By default, 3 clusters are produced; use SetClustersSize to request a different number.

Parameters

Libraries.Compute.Statistics.DataFrame: The DataFrame we want to do our calculations on.

Return

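Libraries.Compute.Statistics.Clustering.ClusterResult: The result of the clustering, which provides the clusters and the cluster index assigned to each row.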

Example

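use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Clustering.ClusterByMeans
use Libraries.Compute.Statistics.Clustering.ClusterResult
use Libraries.Compute.Statistics.Clustering.Cluster
use Libraries.Compute.Statistics.Columns.IntegerColumn
use Libraries.Containers.Array
use Libraries.Interface.Controls.Charts.ScatterPlot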
    
//make a DataFrame and toss some data in it
DataFrame frame
frame:LoadFromCommaSeparatedValue(
    "X,Y
    1,2
    2,4
    3,6
    4,8
    5,10
    9,18
    10,20
    11,22
    12,24
    13,26"
)
//set the range of points on which we will calculate distance
frame:AddSelectedColumnRange(0,1)
    
output "Calculating K-means Clustering"
ClusterByMeans means
means:SetClustersSize(3)
    
//Clusters return an additional column with labels, so they can 
//be included in a chart or other approach, per point
//If we want this in the DataFrame, we need to add it manually
ClusterResult result = means:Cluster(frame)
Array<Cluster> value = result:GetClusters()
IntegerColumn assignments = result:GetClusterIndices()
assignments:SetHeader("Clusters")
frame:AddColumn(assignments) 

//we can also chart the clusters if
//we specify that the new clusters are a factor
frame:AddSelectedFactor(2)
ScatterPlot chart = frame:ScatterPlot()
chart:SetTitle("K-Means Clustering Demo")
chart:Display()

Cluster(Libraries.Compute.Statistics.DataFrame frame, integer seed)

Parameters

Libraries.Compute.Statistics.DataFrame: The DataFrame we want to do our calculations on.
integer seed: A fixed seed for the clustering.

Return

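Libraries.Compute.Statistics.Clustering.ClusterResult: The result of the clustering, which provides the clusters and the cluster index assigned to each row.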

Example

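use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Clustering.ClusterByMeans
use Libraries.Compute.Statistics.Clustering.ClusterResult
use Libraries.Compute.Statistics.Clustering.Cluster
use Libraries.Compute.Statistics.Columns.IntegerColumn
use Libraries.Containers.Array
use Libraries.Interface.Controls.Charts.ScatterPlot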
    
//make a DataFrame and toss some data in it
DataFrame frame
frame:LoadFromCommaSeparatedValue(
    "X,Y
    1,2
    2,4
    3,6
    4,8
    5,10
    9,18
    10,20
    11,22
    12,24
    13,26"
)
//set the range of points on which we will calculate distance
frame:AddSelectedColumnRange(0,1)
    
output "Calculating K-means Clustering"
ClusterByMeans means
means:SetClustersSize(3)
    
//Clusters return an additional column with labels, so they can 
//be included in a chart or other approach, per point
//If we want this in the DataFrame, we need to add it manually
ClusterResult result = means:Cluster(frame, 42)
Array<Cluster> value = result:GetClusters()
IntegerColumn assignments = result:GetClusterIndices()
assignments:SetHeader("Clusters")
frame:AddColumn(assignments) 

//we can also chart the clusters if
//we specify that the new clusters are a factor
frame:AddSelectedFactor(2)
ScatterPlot chart = frame:ScatterPlot()
chart:SetTitle("K-Means Clustering Demo")
chart:Display()

Compare(Libraries.Language.Object object)

This action compares the hash codes of two objects and returns an integer: -1 if this object's hash code is smaller than that of the object passed as a parameter, 0 if the two are equal, and 1 if it is larger. This action was changed in Quorum 7 to return an integer, instead of a CompareResult object, because the previous implementation was causing efficiency issues.

Parameters

Libraries.Language.Object: The object to compare to.

Return

integer: The result of the comparison: -1 (smaller), 0 (equal), or 1 (larger).

Example

use Libraries.Language.Object
Object o
Object t
integer result = o:Compare(t) //1 (larger), 0 (equal), or -1 (smaller)

Equals(Libraries.Language.Object object)

This action determines if two objects are equal based on their hash code values.

Parameters

Libraries.Language.Object: The object to be compared.

Return

boolean: True if the hash codes are equal and false if they are not equal.

Example

use Libraries.Language.Object
use Libraries.Language.Types.Text
Object o
Text t
boolean result = o:Equals(t)

GetClustersSize()

This returns the number of clusters expected when the algorithm has finished. The default is 3.

Return

integer: The number of clusters.
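No example accompanies this action, so here is a minimal sketch that uses only actions documented on this page. It reads the cluster count before any call to SetClustersSize, which the description above says defaults to 3.

```quorum
use Libraries.Compute.Statistics.Clustering.ClusterByMeans

ClusterByMeans means
//no call to SetClustersSize yet, so this reads the default value
integer count = means:GetClustersSize()
output "Clusters requested: " + count
```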

GetHashCode()

This action gets the hash code for an object.

Return

integer: The integer hash code of the object.

Example

use Libraries.Language.Object
Object o
integer hash = o:GetHashCode()

SetClustersSize(integer amount)

This sets the number of clusters expected when the algorithm has finished. The default is 3.

Parameters

integer amount: The number of clusters to produce.
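A minimal sketch of changing the cluster count, using only actions documented on this page. The value 2 is an arbitrary choice for illustration; per the class description, the count must be greater than 0 and strictly less than the number of rows in the DataFrame later passed to Cluster.

```quorum
use Libraries.Compute.Statistics.Clustering.ClusterByMeans

ClusterByMeans means
//request 2 clusters instead of the default
means:SetClustersSize(2)
//read the value back to check the setting
output means:GetClustersSize()
```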
