Column Splitting

This tutorial demonstrates how to split columns in our datasets

Splitting a Column

Let us immediately start off with an example to explain what it means to column split. Imagine for a moment that we had a DataFrame below:

"player,year,season,stats 1,A,1,22-2 2,A,2,29-3 3,B,1,18-6 4,B,2,11-8 5,C,1,12-5 6,C,2,19-2"

The basic idea in splitting columns is we need a way to separate out the value in the cells. For simple cases, like if there is a character between them, we can use the SplitSelectedColumns(Array, text) action. For this, we first provide an array of the names of the headers for the new columns. Second, we provide it the character we want to split up the values by. Here is a brief description of how SplitSelectedColumns(Array, text) work:

Calculate Columns Helper Functions
Class / ActionDescriptionUsage
dataFrameObject:SplitSelectedColumns(Array textHeader, text separatorChar)This action takes in an array that would be the new header for the created column as well as the character we are looking for to separate values by. The character will go through that selected column and create a new header column with just the separated value.frame:SplitSelectedColumns(newHeader,"-")

For this example, we will be using a sample file of player stats for an online game, league.csv. Here is an example of the CSV file:

League of Legends Stats
PlayerTierLPLevelWin RatioKills/Deaths/Assists
jianhunChallenger117053156-1124/33/11
TES 77Challenger113460209-1614/33/11
Shadow BongBongChallenger1121241312-2534/33/11
KING MIRChallenger1084654406-3554/33/11
We can click this link to download the file for the CSV.

What we want to do is try and separate the Win Ratio column so on one column, we have the wins and the next column we have the loses. So let us start off with downloading the provided file here. We will then want to create a DataFrame object, which we can call 'frame.' Our 'frame' object will contain all the actions we need in order to successfully split our column. We first want to load the file, which we can do by calling Load(text fileLocation) and put it in the file location of our league.csv file.

use Libraries.Compute.Statistics.DataFrame
use Libraries.Containers.Array

DataFrame frame
frame:Load("data/league.csv")

Next, we want to create a text array in which they will be the newly separated columns. Since we want to separate wins and loses, we will use the Add(text arrayElement) to add heading titles into our array.

//create the names for the new columns
Array headers
headers:Add("Wins")
headers:Add("Loses")

Next, we have to select the column we are trying to separate. Remember that in a CSV file, columns start off with 0 so in this case, the column with the non-separated wins and loses will be column 4. We would call the action AddSelectedColumn(integer columnNumber) and pass in 4 which will read in the column of 'Win Ratio.' An alternative method that accomplishes the same read-in is to use the action AddSelectedColumns(text columnHeader) which we would pass in the 'Win Ratio' column header and read in that same column. For this tutorial, we will just add the numerical assigned column.

frame:AddSelectedColumn(4)

The last step is to call the action SplitSelectedColumns(Array , text) on our 'frame' object. For the first parameter, we will pass in the 'headers' array we created previously and in the second parameter, we will want to pass in the character we want to separate the column by. What this does is that it will search through the column and find that separator character. Once it finds the character, it separates the values from the left and right hand sides and puts them into separate columns. In this dataset, the values we want to separate are divided by a '-' so we will be inserting that character.

We also want to save our newly created columns into a new DataFrame so we will assign this modified frame into a new frame called 'result.' We finally can output both DataFrames onto the console to see our changes.

DataFrame result = frame:SplitSelectedColumns(headers, "-")

Below is an example of the entire program:

use Libraries.Compute.Statistics.DataFrame
use Libraries.Containers.Array
DataFrame frame
frame:Load("data/league.csv")
    
//create the names for the new columns
Array headers
headers:Add("Wins")
headers:Add("Loses")

//set which column or columns we want to split in this way
//then split the values and output them to the console
frame:AddSelectedColumn(4)
DataFrame result = frame:SplitSelectedColumns(headers, "-")

output "original frame"
output frame:ToText()

output "column split frame of wins/loses"
output result:ToText()

Try it Yourself: Splitting

Press the blue run button to execute the code in the code editor. Press the red stop button to end the program. Your program will work when the console outputs "Build Successful!"

To view the program we have made, we can download the program file

Column Splitting With Inheritance

To view the program we have made, we can download the program file

package Libraries.Compute.Statistics.Transforms
use Libraries.Containers.Array
use Libraries.Compute.Statistics.DataFrame

class LiteralColumnSplitter is ColumnSplitter
text delimiter = undefined

action SplitValue(text value) returns Array
    if delimiter = undefined or delimiter:IsEmpty()
        alert("Cannot split the column based on an undefined text splitter.")
    end

    if value = undefined or value:IsEmpty()
        alert("Cannot split the column based on an undefined text value.")
    end

    return value:Split(delimiter)
end

action GetDelimiter returns text
    return delimiter
end

action SetDelimiter(text delimiter)
    me:delimiter = delimiter
end
end

Once we have created a new splitter, we can send it to any DataFrame using its Transform action. Now, we will go through the process on how to use this same idea with our DataFrames. Again, we will want to create a DataFrame object, which we can call it 'frame.' Our 'frame' object will contain all the actions we need in order to successfully split our column. We first want to load the file, which we can do by calling Load(text fileLocation) and put it in the file location of our league.csv file.

DataFrame frame
frame:Load("data/league.csv")

Next, we want to create a text array in which they will be the newly separated columns. Since we want to separate kills, deaths and assists, we will use the Add(text arrayElement) to add heading titles into our array.

//create the names for the new columns
Array headers
headers:Add("Kills")
headers:Add("Deaths")
headers:Add("Assists")

Finally, we want to select the column we are splitting, 'Kills/Deaths/Assists' using our action AddSelectedColumns(text header). This will load our selected column into the DataFrame in order to properly split up the entries. We also want to call another action SplitSelectedColumns(Array headers, text charDelim, bool hasDelimeter) which takes in three parameters: the array we have created, a character separating each value, and boolean to verify such delimiter in the column. This is very similar to our previous splitting action. Now that we have our column set up, let's create a new DataFrame and call it 'result' and save the result from SplitSelectedColumns().

use Libraries.Compute.Statistics.DataFrame
use Libraries.Containers.Array

DataFrame frame
frame:Load("data/league.csv")
 
//create the names for the new columns
Array headers
headers:Add("Kills")
headers:Add("Deaths")
headers:Add("Assists")

text delimiter = undefined
boolean hasDelimeter = true

// use our action to split the columns
frame:AddSelectedColumns("Kills/Deaths/Assists")
DataFrame result = frame:SplitSelectedColumns(headers, "/", hasDelimeter)

//DataFrame result = frame:Transform(splitter)
output result:ToText()
output frame:ToText()

Try it Yourself: Splitting with Inheritance

Press the blue run button to execute the code in the code editor. Press the red stop button to end the program. Your program will work when the console outputs "Build Successful!"

To view the program we have made, we can download the program file

End of Lesson

You have reached the end of the lesson for Data Transformations. To view more tutorials, press the button below or you can go back to the previous lesson pages.