DataFrame Class
Summary
Allows tables and other data to be loaded into memory and manipulated. Designed to mimic components of the R packages `dplyr` and `tidyr`.
Constructor
DataFrame( [string or options_array table, array descriptions, array groups] )
| Argument | Contents |
|---|---|
| table | Optional. If a string, the filename of the table to be opened (only `*.csv` and `*.bin` are currently supported). If an options array, each option name is the name of a field in the table, and each option value is an array of values; see the example below. If null, an empty data frame is created. |
| descriptions | Optional. Provides descriptions for each column in the data frame. Names must match the column names in the table argument. The descriptions are only visible if written to a bin file. |
| groups | Optional. Lists the grouping fields. See the group_by() method. |
Example
tbl.a = {1, 3, 5}
tbl.b = {"a", "b", "c"}
desc.a = "This is a column of numbers."
desc.b = "This is a column of letters."
df = CreateObject("DataFrame" , tbl, desc)
df.view()
Methods
arrange( array fields )
Sorts a table based on a list of fields.
| Argument | Contents |
|---|---|
| fields | A list of fields to sort by |
Example
tbl.a = {1, 5,5, 3}
tbl.b = {"a", "c","b", "d"}
desc.a = "This is a column of numbers."
desc.b = "This is a column of letters."
df = CreateObject("DataFrame" , tbl, desc)
df.arrange({"a" ,"b"})
df.view()
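To make the ordering concrete, here is a minimal Python sketch (not GISDK, and not the DataFrame implementation) of sorting a column-oriented table by a list of fields. It assumes ascending order with later fields breaking ties in earlier ones, which this reference does not state. `arrange_columns` is a hypothetical helper name.

```python
def arrange_columns(table, fields):
    """Return a copy of a column-oriented table sorted by `fields`.

    Assumption: ascending order, later fields break ties in earlier ones.
    """
    n_rows = len(next(iter(table.values())))
    # Build one sort key per row from the requested fields, then order the row indices.
    order = sorted(range(n_rows), key=lambda i: tuple(table[f][i] for f in fields))
    # Reorder every column with the same row order so rows stay intact.
    return {name: [values[i] for i in order] for name, values in table.items()}


# Same data as the arrange() example above.
tbl = {"a": [1, 5, 5, 3], "b": ["a", "c", "b", "d"]}
arranged = arrange_columns(tbl, ["a", "b"])
print(arranged)
```

Under these assumptions, the two rows that share `a = 5` are ordered by their `b` values.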
bin_field( options array )
Creates a field of categories based on a continuous numeric field.
| Option | Type | Contents |
|---|---|---|
| in_field | string | Name of the continuous field to be "binned" |
| bins | int or array | If int, the number of bins to create; the range of in_field is divided evenly. If array of integers, each array element is the start of a bin, and the end of the last bin is assumed to be the max value in the field. E.g. {0, 1} is: 0 <= x < 1 and 1 <= x < [max number]. |
| labels | array | Optional. The names of the bins. If bins is a list, the array length must be 1 less than the length of bins. If bins is a number, the array length must be the same as bins. If null, bins will be labelled 1-n. |
Example
tbl.a = {1, 2,3, 4}
tbl.b = {"a", "c","b", "d"}
desc.a = "This is a column of numbers."
desc.b = "This is a column of letters."
df = CreateObject("DataFrame" , tbl, desc)
df.bin_field({"in_field": "a", "bins":2, "labels":{"kkk","kkk2"}})
df.view()
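The two forms of the `bins` option can be illustrated with a short Python sketch (not GISDK, and not the actual implementation). It returns 1-based bin numbers only and leaves labels out. It assumes the field's maximum value falls in the last bin and that values below the first bin start get no bin. `bin_numbers` is a hypothetical name.

```python
def bin_numbers(values, bins):
    """Return a 1-based bin number for each value.

    bins as int:  that many equal-width bins over the observed range.
    bins as list: each element is the start of a bin.
    Assumptions: the maximum value belongs to the last bin, and values
    below the first bin start map to None.
    """
    if isinstance(bins, int):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins
        starts = [lo + k * width for k in range(bins)]
    else:
        starts = sorted(bins)
    numbers = []
    for x in values:
        # The count of bin starts at or below x is the 1-based bin number.
        n = sum(1 for s in starts if s <= x)
        numbers.append(n if n > 0 else None)
    return numbers


print(bin_numbers([1, 2, 3, 4], 2))         # two equal-width bins over 1..4
print(bin_numbers([0, 0.5, 1, 7], [0, 1]))  # bins start at 0 and at 1
```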
bind_rows(DataFrame df )
Appends the rows of one data frame to another. Both data frames should have the same columns.
| Argument | Contents |
|---|---|
| df | The data frame to be appended. It must have the same number of columns as the current dataframe |
Example
tbl1.a = {1, 2,3, 4}
tbl1.b = {"a", "b","v", "d"}
df = CreateObject("DataFrame" , tbl1, desc)
tbl2.a = {4, 5,6, 7}
tbl2.b = {"m", "n","o", "p"}
df2 = CreateObject("DataFrame" , tbl2, desc)
df.bind_rows(df2)
df.view()
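As a rough illustration of what appending rows means for a column-oriented table, the following Python sketch (not GISDK) concatenates matching columns. Raising an error when the column names differ is an assumption of this sketch, since the reference only says the columns should match.

```python
def bind_rows(top, bottom):
    """Append the rows of `bottom` to `top`; both are dicts of column lists."""
    if set(top) != set(bottom):
        # Assumption: mismatched columns are treated as an error.
        raise ValueError("both tables must have the same columns")
    # Concatenate column by column so every row keeps its values together.
    return {name: top[name] + bottom[name] for name in top}


tbl1 = {"a": [1, 2, 3, 4], "b": ["a", "b", "v", "d"]}
tbl2 = {"a": [4, 5, 6, 7], "b": ["m", "n", "o", "p"]}
combined = bind_rows(tbl1, tbl2)
print(combined)
```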
check()
Checks that the data frame is valid.
Returns
The data frame if the check succeeds; otherwise an error with a descriptive message.
colnames(array options)
Either returns an array of all column names or sets all column names. Use rename() to change individual column names.
| Option | Type | Contents |
|---|---|---|
| new_names | array or vector | Optional. A list of the new column names, one for each existing column in the data frame. |
| start | string | Optional. The name of the first column to be returned; defaults to the first column |
| labels | string | Optional. The name of the last column to be returned; defaults to the last column |
Returns
An array of column names
Example
tbl1.ID = {1, 2,3, 4}
tbl1.SHOP_TYPE = {"Bakery", "Restaurant","Beauty Salon", "Sandy"}
tbl1.OwnerlastName = {"Smith", "Jones","Christie", "Good"}
df = CreateObject("DataFrame", tbl1)
df.colnames({"new_names": {"ID", "category", "owner"}})
ShowArray(df.colnames()) // Show all the column names
ShowArray(df.colnames({"start": "category"})) // Show column names, starting with category
coltypes()
Gets the column types.
Returns
An array of column types. Possible types returned are: short, long, double, string.
colwidths(array col_names)
Gets the column widths.
| Argument | Contents |
|---|---|
| col_names | Optional. |
Returns
An array of column widths.
copy()
Creates a complete copy of the data frame.
Example
// if you use new_df = old_df you simply get two variable names that point to the same object
// Instead, use:
new_df = old_df.copy()
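The difference between a second name and a real copy works the same way in many languages. The Python sketch below (an analogy only, not GISDK) uses a plain dictionary in place of a data frame and `copy.deepcopy` in place of `copy()`.

```python
import copy

old_df = {"a": [1, 2, 3]}

alias = old_df                   # a second name for the same object
new_df = copy.deepcopy(old_df)   # an independent copy of the data

# Change the original after both assignments.
old_df["a"].append(4)

print(alias["a"])    # the alias reflects the change
print(new_df["a"])   # the copy keeps the values it had when it was made
```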
filter(string query)
Applies a query to a table object.
| Argument | Contents |
|---|---|
| query | A valid query string (e.g. "ID = 5", "Name = 'Sam'", "id > 10 and size > 100"). The "Select * where" clause used in GISDK queries is optional. |
Example
tbl1.a = {1, 2,3, 4}
tbl1.b = {"a", "b","c", "d"}
df = CreateObject("DataFrame" , tbl1, desc)
tbl2.a = {4, 5,6, 7}
tbl2.b = {"m", "n","o", "p"}
df2 = CreateObject("DataFrame" , tbl2, desc)
df.filter( "a > 2")
df.view()
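The effect of a query such as "a > 2" can be pictured with this Python sketch (not GISDK), which keeps only the rows that satisfy a condition. It stands in a Python callable for the query string, so query parsing is not shown. `filter_rows` is a hypothetical name.

```python
def filter_rows(table, predicate):
    """Keep the rows of a column-oriented table for which `predicate` is true."""
    n_rows = len(next(iter(table.values())))
    # Evaluate the condition once per row, passing the row as a dict.
    keep = [
        i for i in range(n_rows)
        if predicate({name: col[i] for name, col in table.items()})
    ]
    return {name: [col[i] for i in keep] for name, col in table.items()}


tbl = {"a": [1, 2, 3, 4], "b": ["a", "b", "c", "d"]}
filtered = filter_rows(tbl, lambda row: row["a"] > 2)
print(filtered)
```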
gather(array cols, string key, string value, string or numeric fill)
Transforms data from wide to long format. Places the names of multiple columns into a single "key" column and places the values of those multiple columns into a single "value" column. Reverse of spread().
| Argument | Contents |
|---|---|
| cols | A list of the fields to gather |
| key | The name of the new column that will hold the names of the gathered columns |
| value | The name of the new column that will hold the values of the gathered columns |
| fill | The string/number to fill into empty data cells of new columns |
Example
tbl1.ID = {1, 2,3, 4}
tbl1.SHOPTYPE = {"Bakery", "Restaurant","Beauty Salon", "Grocery" }
tbl1.OwnerlastName = {"John", "Rita", "Paul", "Mary"}
df = CreateObject("DataFrame" , tbl1)
df.gather({"ID", "SHOPTYPE", "OwnerlastName"}, "key", "value")
df.view()
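The wide-to-long reshaping can be sketched in a few lines of Python (not GISDK, and not the actual implementation). The sketch assumes that columns not listed in `cols` are repeated for every gathered column and that rows are emitted one gathered column at a time. The data and the `year`/`count` names are made up for illustration.

```python
def gather(table, cols, key="key", value="value"):
    """Reshape a column-oriented table from wide to long format."""
    n_rows = len(next(iter(table.values())))
    kept = [name for name in table if name not in cols]
    out = {name: [] for name in kept}
    out[key] = []
    out[value] = []
    for col in cols:
        for i in range(n_rows):
            # Repeat the untouched columns for each gathered column.
            for name in kept:
                out[name].append(table[name][i])
            out[key].append(col)              # the column name goes to the key column
            out[value].append(table[col][i])  # the cell goes to the value column
    return out


wide = {"ID": [1, 2], "y2020": [10, 20], "y2021": [11, 21]}
long_tbl = gather(wide, ["y2020", "y2021"], "year", "count")
print(long_tbl)
```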
read( string filename, [ array fields, options_array expr_vars] )
Reads a DataFrame from a file, either a CSV or an FFB (*.bin), based on the file extension.
| Argument | Contents |
|---|---|
| filename | Full path of the file. The file type is inferred from the extension. |
| fields | Optional. Array of column names to read. If null, all columns are read. |
| expr_vars | Optional. Name/value pairs such that "{Name}" found in the file will be replaced with Value. |
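Two behaviours described above, choosing a reader from the file extension and replacing "{Name}" placeholders with values, can be sketched in Python (not GISDK). That `read()` hands off to `read_csv()` or `read_bin()`, and that the extension check ignores case, are assumptions of this sketch. `reader_for` and `substitute` are hypothetical names, and the file path is made up.

```python
import os


def reader_for(filename):
    """Pick a reader name from the file extension (assumed case-insensitive)."""
    ext = os.path.splitext(filename)[1].lower()
    if ext == ".csv":
        return "read_csv"
    if ext == ".bin":
        return "read_bin"
    raise ValueError(f"unsupported file type: {ext!r}")


def substitute(text, expr_vars):
    """Replace each "{Name}" in `text` with the matching value."""
    for name, val in expr_vars.items():
        text = text.replace("{" + name + "}", str(val))
    return text


print(reader_for("C:/data/trips.csv"))
print(substitute("{Year}_trips", {"Year": 2020}))
```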
read_bin( string filename, [ array fields, options_array expr_vars] )
Reads a DataFrame from an FFB (*.bin) file.
| Argument | Contents |
|---|---|
| filename | Full path of the *.bin file. |
| fields | Optional. Array of column names to read. If null, all columns are read. |
| expr_vars | Optional. Name/value pairs such that "{Name}" found in the file will be replaced with Value. |
read_csv( string filename, [ array fields, options_array expr_vars] )
Reads a DataFrame from a CSV file.
| Argument | Contents |
|---|---|
| filename | Full path of the *.csv file. |
| fields | Optional. Array of column names to read. If null, all columns are read. |
| expr_vars | Optional. Name/value pairs such that "{Name}" found in the file will be replaced with Value. |
read_view( string filename, [ string set, array fields, options_array expr_vars, string null_to_zero, string include_descriptions] )
Converts a view into a data frame. Useful if you want to specify a selection set or already have a view open.
| Argument | Contents |
|---|---|
| filename | Full path of file. The file type is inferred from the extension |
| set | Optional. A set name |
| fields | Optional. Array of column names to read. If null, all columns are read. |
| expr_vars | Optional. Name/value pairs such that "{Name}" found in the file will be replaced with Value. |
| null_to_zero | Optional. Whether to convert null values to zero. Either "true" or "false". Defaults to false |
| include_descriptions | Optional. Whether to include field descriptions. Not applicable for all table types. Either "true" or "false". Defaults to false |
©2026 Caliper Corporation