DataFrame Class

Summary

Allows tables and other data to be loaded into memory and manipulated. Designed to mimic components of the R packages `dplyr` and `tidyr`.

Constructor

DataFrame( [string or options_array table, array descriptions, array groups] )

Argument Contents
table Optional. If a string, the filename of the table to be opened (only *.csv and *.bin are currently supported). If an options array, each option name is the name of a field in the table, and each option value is an array of values. See the example below. If null, an empty data frame is created.
descriptions Optional. Provides descriptions for each column in the data frame. Names must match column names in the table argument. The descriptions will only be visible if written to a bin file.
groups Optional. Lists the grouping fields. See the group_by() method.

Example

tbl.a = {1, 3, 5}
tbl.b = {"a", "b", "c"}
desc.a = "This is a column of numbers."
desc.b = "This is a column of letters."
df = CreateObject("DataFrame", tbl, desc)
df.view()
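
The other two forms of the table argument can be sketched in the same way. This is a minimal illustration, not a definitive usage pattern: the file path and the variable names are hypothetical placeholders, and it assumes a CSV file exists at that location.

```
// Hypothetical path; replace with a real *.csv or *.bin file.
csv_file = "C:\\data\\shops.csv"

// Passing a filename loads that table into the data frame.
df_file = CreateObject("DataFrame", csv_file)
df_file.view()

// Omitting the table argument creates an empty data frame.
df_empty = CreateObject("DataFrame")
```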

Methods

arrange( array fields )

Sorts a table based on a list of fields.

Argument Contents
fields A list of fields to sort by

Example

tbl.a = {1, 5, 5, 3}
tbl.b = {"a", "c", "b", "d"}
desc.a = "This is a column of numbers."
desc.b = "This is a column of letters."
df = CreateObject("DataFrame", tbl, desc)
df.arrange({"a", "b"})
df.view()

bin_field( options array )

Creates a field of categories based on a continuous numeric field.

Option Type Contents
in_field string Name of the continuous field to be "binned".
bins int or array If int, the number of bins to create; the range of in_field will be divided up evenly. If an array of integers, each array element is the start of a bin, and the end of the last bin is assumed to be the max value in the field. For example, {0, 1} gives the bins 0 <= x < 1 and 1 <= x < [max number].
labels array Optional. The names of the bins. If bins is a list, the array length must be 1 less than the length of bins. If bins is a number, the array length must be the same as bins. If null, bins will be labelled 1 through n.

Example

tbl.a = {1, 2, 3, 4}
tbl.b = {"a", "c", "b", "d"}
desc.a = "This is a column of numbers."
desc.b = "This is a column of letters."
df = CreateObject("DataFrame", tbl, desc)
df.bin_field({"in_field": "a", "bins": 2, "labels": {"kkk", "kkk2"}})
df.view()
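
The labels option is documented as optional, with bins labelled 1 through n when it is null. A minimal sketch of that case, reusing the numeric column from the example above (the expected labels in the comment follow the option description rather than a verified run):

```
tbl = null
tbl.a = {1, 2, 3, 4}
df = CreateObject("DataFrame", tbl)
// Two evenly sized bins over the range of "a". With no labels given,
// the option description says the bins are labelled 1 through n.
df.bin_field({"in_field": "a", "bins": 2})
df.view()
```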

bind_rows( DataFrame df )

Appends the rows of one data frame to another. Both data frames should have the same columns.

Argument Contents
df The data frame to be appended. It must have the same number of columns as the current data frame.

Example

tbl1.a = {1, 2, 3, 4}
tbl1.b = {"a", "b", "v", "d"}
df = CreateObject("DataFrame", tbl1)
tbl2.a = {4, 5, 6, 7}
tbl2.b = {"m", "n", "o", "p"}
df2 = CreateObject("DataFrame", tbl2)
df.bind_rows(df2)
df.view()

check()

Checks that the data frame is valid.

Returns

The data frame if the check is successful; otherwise, an error with a descriptive message.

colnames(array options)

Either returns a vector of all column names or sets all column names. Use rename() to change individual column names.

Option Type Contents
new_names array or vector Optional. A list of the new column names, one for each existing column in the data frame.
start string Optional. The name of the first column to be returned; defaults to the first column.
labels string Optional. The name of the last column to be returned; defaults to the last column.

Returns

An array of column names

Example

tbl1.ID = {1, 2, 3, 4}
tbl1.SHOP_TYPE = {"Bakery", "Restaurant", "Beauty Salon", "Sandy"}
tbl1.OwnerlastName = {"Smith", "Jones", "Christie", "Good"}
df = CreateObject("DataFrame", tbl1)
df.colnames({"new_names": {"ID", "category", "owner"}})
ShowArray(df.colnames()) // Show all the column names
ShowArray(df.colnames({"start": "category"})) // Show column names, starting with category

coltypes()

Gets the column types.

Returns

An array of column types. Possible types returned are: short, long, double, string.
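
Column types can be inspected alongside column names. A short sketch, assuming ShowArray() accepts the returned array as it does in the colnames() example; the table contents are illustrative only:

```
tbl = null
tbl.a = {1, 2, 3}
tbl.b = {"x", "y", "z"}
df = CreateObject("DataFrame", tbl)
ShowArray(df.colnames()) // column names
ShowArray(df.coltypes()) // one type per column
```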

colwidths( array col_names )

Gets the column widths.

Argument Contents
col_names Optional.

Returns

An array of column widths.

copy()

Creates a complete copy of the data frame.

Example

// if you use new_df = old_df you simply get two variable names that point to the same object
// Instead, use:
new_df = old_df.copy()
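
The difference matters as soon as one of the two variables is modified. The sketch below assumes that filter() changes the data frame it is called on, as the other examples on this page suggest; under that assumption only the copy is filtered.

```
tbl = null
tbl.a = {1, 2, 3, 4}
old_df = CreateObject("DataFrame", tbl)

new_df = old_df.copy()
new_df.filter("a > 2") // intended to affect only the copy

new_df.view() // the filtered copy
old_df.view() // the original, for comparison
```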

filter(string query)

Applies a query to a table object.

Argument Contents
query A valid query string (e.g. "ID = 5", "Name = 'Sam'", "id > 10 and size > 100"). The "Select * where" clause used in GISDK queries is optional.

Example

tbl1.a = {1, 2, 3, 4}
tbl1.b = {"a", "b", "c", "d"}
df = CreateObject("DataFrame", tbl1)
tbl2.a = {4, 5, 6, 7}
tbl2.b = {"m", "n", "o", "p"}
df2 = CreateObject("DataFrame", tbl2)
df.filter("a > 2")
df.view()
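
Conditions can be combined in one query string, as in the "id > 10 and size > 100" form listed for the query argument. A minimal sketch mixing a numeric and a string condition (the data is illustrative):

```
tbl = null
tbl.a = {1, 2, 3, 4}
tbl.b = {"a", "b", "c", "d"}
df = CreateObject("DataFrame", tbl)
// Keep rows where a is greater than 1 and b equals 'c'.
df.filter("a > 1 and b = 'c'")
df.view()
```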

gather( array cols, string key, string value, [string or numeric fill] )

Transforms data from wide to long format. Places the names of multiple columns into a single "key" column and places the values of those multiple columns into a single "value" column. Reverse of spread().

Argument Contents
cols A list of the fields to gather
key The name of the new column that will hold the names of the gathered columns
value The name of the new column that will hold the values of the gathered columns
fill The string/number to fill into empty data cells of new columns

Example

tbl1.ID = {1, 2, 3, 4}
tbl1.SHOPTYPE = {"Bakery", "Restaurant", "Beauty Salon", "Grocery"}
tbl1.OwnerlastName = {"John", "Rita", "Paul", "Mary"}
df = CreateObject("DataFrame", tbl1)
df.gather({"ID", "SHOPTYPE", "OwnerlastName"}, "key", "value")
df.view()

read( string filename, [ array fields, options_array expr_vars] )

Reads a DataFrame from a file, either a CSV or an FFB (*.bin) file, based on the file extension.

Argument Contents
filename Full path of file. The file type is inferred from the extension
fields Optional. Array of column names to read. If null, all columns are read
expr_vars Optional. Name/value pairs such that "{Name}" found in the file will be replaced with Value
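
A minimal sketch of reading selected columns from a file. It assumes that read() fills the data frame it is called on, which this page does not state explicitly; the path and the column names are hypothetical placeholders.

```
// Hypothetical path; replace with a real *.csv or *.bin file.
file = "C:\\data\\shops.csv"

df = CreateObject("DataFrame")
// Read only two columns; the column names here are placeholders.
df.read(file, {"ID", "SHOPTYPE"})
df.view()
```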

read_bin( string filename, [ array fields, options_array expr_vars] )

Reads a DataFrame from an FFB (*.bin) file.

Argument Contents
filename Full path of file. The file type is inferred from the extension
fields Optional. Array of column names to read. If null, all columns are read
expr_vars Optional. Name/value pairs such that "{Name}" found in the file will be replaced with Value

read_csv( string filename, [ array fields, options_array expr_vars] )

Reads a DataFrame from a CSV file.

Argument Contents
filename Full path of file. The file type is inferred from the extension
fields Optional. Array of column names to read. If null, all columns are read
expr_vars Optional. Name/value pairs such that "{Name}" found in the file will be replaced with Value

read_view( string filename, [ string set, array fields, options_array expr_vars, string null_to_zero, string include_descriptions] )

Converts a view into a data frame. Useful if you want to specify a selection set or already have a view open.

Argument Contents
filename Full path of file. The file type is inferred from the extension
set Optional. A set name
fields Optional. Array of column names to read. If null, all columns are read
expr_vars Optional. Name/value pairs such that "{Name}" found in the file will be replaced with Value
null_to_zero Optional. Whether to convert null values to zero. Either "true" or "false". Defaults to "false"
include_descriptions Optional. Whether to include field descriptions. Not applicable for all table types. Either "true" or "false". Defaults to "false"

©2026 Caliper Corporation