Introduction to Data Science
This tutorial introduces the concept of DataFrames in Quorum.
What is a DataFrame?
The key component in loading data is the DataFrame. A DataFrame is a set of rows and columns, like a table, except that each column understands the kind of data it holds and can transform it in a variety of ways. For example, one column might contain text, another might contain numbers, and another might contain integers. DataFrames also support selections, much like selecting a row or column in a spreadsheet, and can perform operations on the selected data.
We use DataFrames for everything in our library, and we always load and save them the same way. First, by default, Quorum loads DataFrame objects from files in the Comma Separated Values (CSV) format. Second, if we want to, we can load a DataFrame by hand, meaning we insert the values into the table manually.
Creating a DataFrame
To create a DataFrame, we declare it like any other variable, using the form: DataType variableName. In this case, the data type is DataFrame, and the variable can have any meaningful name. The simplest choice is 'frame', and many of our tutorials use that name for consistency.
Code Example
use Libraries.Compute.Statistics.DataFrame
DataFrame frame
Once we have created this DataFrame object, we can use it to access any datasets we have loaded, create charts, compute statistics, and much more.
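As a rough sketch of the default CSV loading described earlier, the example below declares a DataFrame, loads a file into it, and prints the result. The file path "Data/Dogs.csv" is a hypothetical placeholder, not a file this tutorial provides, so substitute the path of a CSV file in your own project.

```quorum
use Libraries.Compute.Statistics.DataFrame

// Declare the DataFrame, using the conventional name 'frame'
DataFrame frame

// Assumption: "Data/Dogs.csv" is a hypothetical path to a CSV file in the project
frame:Load("Data/Dogs.csv")

// Print the loaded table to the console as text
output frame:ToText()
```

If the file is found and is valid CSV, the console should show the table contents. The exact layout of that output depends on the data in the file.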
Try it Yourself!
Press the blue run button to execute the code in the code editor, and press the red stop button to end the program. The program has built correctly when the console outputs "Build Successful!"
Next Tutorial
In the next tutorial, we will discuss tidy data, the convention Quorum uses to organize data.