agate.tableset

This module contains the TableSet class, which abstracts a set of related tables into a single data structure. The most common way of creating a TableSet is the Table.group_by() method, which is similar to SQL’s GROUP BY keyword. The resulting tables all have an identical column structure.

TableSet functions as a dictionary. Individual tables in the set can be accessed by using their name as a key. If the table set was created using Table.group_by() then the names of the tables will be the group factors found in the original data.
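
The dictionary-like behaviour described above can be pictured with a plain-Python stand-in. This is only a sketch of the idea and not agate's implementation: the sample rows and the group_rows helper are hypothetical names invented for illustration.

```python
from collections import OrderedDict

# Hypothetical data: each row is (state, city, population in thousands).
rows = [
    ("TX", "Austin", 950),
    ("CA", "Fresno", 540),
    ("TX", "Dallas", 1300),
]

def group_rows(rows, index):
    """Group rows by the value at `index`, preserving first-seen order."""
    groups = OrderedDict()
    for row in rows:
        groups.setdefault(row[index], []).append(row)
    return groups

tables = group_rows(rows, 0)

# The group factors found in the data become the keys,
# much as table names do in a TableSet.
print(list(tables.keys()))
print(tables["TX"])
```

Here the keys are the distinct values of the grouping column, and each value holds only the rows that share that key.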

TableSet replicates the majority of the features of Table. When methods such as TableSet.select(), TableSet.where() or TableSet.order_by() are used, the operation is applied to each table in the set and the result is a new TableSet instance made up of entirely new Table instances.

TableSet instances can also contain other TableSets. This means you can chain calls to Table.group_by() and TableSet.group_by() and end up with data grouped across multiple dimensions. TableSet.aggregate() on nested TableSets will then group across multiple dimensions.
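
As a rough illustration of grouping across two dimensions, the following plain-Python sketch nests one dictionary inside another and then aggregates the result. It does not use agate; the rows and the region/year/sales columns are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical rows: (region, year, sales).
rows = [
    ("north", 2014, 10),
    ("north", 2015, 12),
    ("south", 2014, 7),
    ("north", 2014, 5),
]

# Two levels of grouping: region first, then year within each region.
nested = defaultdict(lambda: defaultdict(list))
for region, year, sales in rows:
    nested[region][year].append(sales)

# Aggregating the nested structure yields one row per (region, year) pair.
summary = [
    (region, year, sum(values))
    for region, years in sorted(nested.items())
    for year, values in sorted(years.items())
]
print(summary)
```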

class agate.tableset.TableMethodProxy(tableset, method_name)

A proxy for TableSet methods that converts them to individual calls on each Table in the set.
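
The proxy idea can be sketched in a few lines of plain Python. This is a hypothetical stand-in and not agate's TableMethodProxy: the MethodProxy class is invented here, plain lists play the role of tables, and list.count plays the role of a table method.

```python
class MethodProxy(object):
    """Hypothetical stand-in: forwards one method call to every member."""

    def __init__(self, members, method_name):
        self.members = members          # dict of name -> object
        self.method_name = method_name

    def __call__(self, *args, **kwargs):
        # Call the named method on each member and collect the results
        # under the same keys.
        return {
            name: getattr(member, self.method_name)(*args, **kwargs)
            for name, member in self.members.items()
        }

# Plain lists stand in for tables; list.count stands in for a table method.
members = {"a": [1, 2, 2], "b": [2, 2, 2]}
count = MethodProxy(members, "count")
result = count(2)
print(result)
```

Because the result keeps the original keys, a caller can look up the outcome for any one member by name.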

class agate.tableset.TableSet(group, key_name='group', key_type=None)

A group of named tables with identical column definitions. Supports (almost) all the same operations as Table. When executed on a TableSet, any operation that would have returned a new Table instead returns a new TableSet. Any operation that would have returned a single value instead returns a dictionary of values.

Parameters:
  • group – A dictionary of string keys and Table values.
  • key_name – A name that describes the grouping properties. Used as the column header when the groups are aggregated. Defaults to 'group'; when the set is created using Table.group_by(), it is the name of the column that was grouped on.
  • key_type – An instance of some subclass of DataType. If not provided, it defaults to Text.
key_name

Get the name of the key this TableSet is grouped by. (If created using Table.group_by() then this is the original column name.)

key_type

Get the DataType this TableSet is grouped by. (If created using Table.group_by() then this is the original column type.)

classmethod from_csv(dir_path, column_info, header=True, **kwargs)

Create a new TableSet from a directory of CSVs. This method uses csvkit if it is available; otherwise it uses Python’s built-in csv module.

kwargs will be passed through to csv.reader().

If you are using Python 2 and not using csvkit, this method is not unicode-safe.

Parameters:
  • dir_path – Path to a directory full of CSV files. All CSV files in this directory will be loaded.
  • column_info – Either a sequence of (column name, type) pairs, where each type is an instance of DataType, or an instance of TypeTester to infer the types.
  • header – If True, the first row of the CSV is assumed to contain headers and will be skipped.
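
To make the directory-loading behaviour concrete, here is a minimal sketch that uses only Python 3's standard library. It is not agate's from_csv: the read_csv_dir helper is hypothetical, it ignores column types entirely, and naming each table after its file name without the extension is an assumption of this sketch.

```python
import csv
import glob
import os
import tempfile

def read_csv_dir(dir_path, header=True, **kwargs):
    """Load every .csv file in dir_path into a dict keyed by base file name."""
    tables = {}
    for path in sorted(glob.glob(os.path.join(dir_path, "*.csv"))):
        # Assumption: the table name is the file name minus its extension.
        name = os.path.splitext(os.path.basename(path))[0]
        with open(path, newline="") as f:
            # Extra keyword arguments go straight to csv.reader().
            rows = list(csv.reader(f, **kwargs))
        # Skip the first row when it holds headers.
        tables[name] = rows[1:] if header else rows
    return tables

# Build a throwaway directory with two small files to exercise the helper.
demo_dir = tempfile.mkdtemp()
for name, value in (("east", "1"), ("west", "2")):
    with open(os.path.join(demo_dir, name + ".csv"), "w", newline="") as f:
        csv.writer(f).writerows([["n"], [value]])

loaded = read_csv_dir(demo_dir)
print(loaded)
```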
to_csv(dir_path, **kwargs)

Write each table in this set to a separate CSV in a given directory. This method uses csvkit if it is available; otherwise it uses Python’s built-in csv module.

kwargs will be passed through to csv.writer().

If you are using Python 2 and not using csvkit, this method is not unicode-safe.

Parameters: dir_path – Path to the directory to write the CSV files to.
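
The one-file-per-table layout can be sketched with Python 3's standard library alone. This is not agate's to_csv: write_csv_dir is a hypothetical helper, lists of rows stand in for tables, and the "<name>.csv" file naming is an assumption of the sketch.

```python
import csv
import os
import tempfile

def write_csv_dir(tables, dir_path, **kwargs):
    """Write each named list of rows to <dir_path>/<name>.csv."""
    if not os.path.exists(dir_path):
        os.makedirs(dir_path)
    for name, rows in tables.items():
        path = os.path.join(dir_path, name + ".csv")
        with open(path, "w", newline="") as f:
            # Extra keyword arguments go straight to csv.writer().
            csv.writer(f, **kwargs).writerows(rows)

out_dir = os.path.join(tempfile.mkdtemp(), "out")
write_csv_dir({"east": [["n"], ["1"]], "west": [["n"], ["2"]]}, out_dir)
written = sorted(os.listdir(out_dir))
print(written)
```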
column_types

Get an ordered list of this TableSet’s column types.

Returns: A tuple of DataType instances.
column_names

Get an ordered list of this TableSet’s column names.

Returns: A tuple of strings.
merge()

Convert this TableSet into a single table. This is the inverse of Table.group_by().

Returns: A new Table.
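
The sense in which merging undoes grouping can be shown with a plain-Python sketch. It does not call agate: the groups dictionary is hypothetical data, and the sketch assumes each row still carries its grouping value in the first position.

```python
# Hypothetical groups: name -> rows, where row[0] is the grouping value.
groups = {
    "TX": [("TX", "Austin"), ("TX", "Dallas")],
    "CA": [("CA", "Fresno")],
}

# Merge: concatenate the rows of every group into one flat list.
merged = [row for name in sorted(groups) for row in groups[name]]
print(merged)

# Grouping the merged rows again recovers the original groups.
regrouped = {}
for row in merged:
    regrouped.setdefault(row[0], []).append(row)
print(regrouped == groups)
```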
aggregate(aggregations=[])

Aggregate data from the tables in this set by performing some set of column operations on the groups and coalescing the results into a new Table.

aggregations must be a list of tuples, where each has three parts: a column_name, an Aggregation instance and a new_column_name.

Parameters: aggregations – A list of triples in the format (column_name, aggregation, new_column_name).
Returns: A new Table.
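
The shape of the triples and of the resulting table can be sketched without agate. In this hypothetical example, plain callables (sum and max) stand in for Aggregation instances, dictionaries stand in for rows, and the aggregate function and its key_name argument are illustrative names rather than the library's implementation.

```python
# Hypothetical groups: name -> list of row dicts.
groups = {
    "north": [{"sales": 10}, {"sales": 5}],
    "south": [{"sales": 7}],
}

# Each triple: (column_name, aggregation, new_column_name). Plain callables
# stand in for Aggregation instances here.
aggregations = [
    ("sales", sum, "total_sales"),
    ("sales", max, "best_sale"),
]

def aggregate(groups, aggregations, key_name="group"):
    """Return one output row per group, one output column per triple."""
    output = []
    for name in sorted(groups):
        row = {key_name: name}
        for column_name, aggregation, new_column_name in aggregations:
            values = [r[column_name] for r in groups[name]]
            row[new_column_name] = aggregation(values)
        output.append(row)
    return output

result = aggregate(groups, aggregations)
print(result)
```

Each group collapses to a single row, with the group name stored under the key column and one new column per triple.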
get(k[, d]) → D[k] if k in D, else d. d defaults to None.
items() → list of D's (key, value) pairs, as 2-tuples
iteritems() → an iterator over the (key, value) items of D
iterkeys() → an iterator over the keys of D
itervalues() → an iterator over the values of D
keys() → list of D's keys
monkeypatch(patch_cls)

Dynamically add patch_cls as a base class of this class.

Parameters: patch_cls – The class to be patched on.
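
One way to add a base class at runtime in Python is to extend the class's __bases__ tuple. The sketch below shows that technique under stated assumptions: Patchable, Target and Extras are hypothetical names, and this is not necessarily how agate implements monkeypatch.

```python
class Patchable(object):
    """Hypothetical base providing a monkeypatch classmethod."""

    @classmethod
    def monkeypatch(cls, patch_cls):
        # Append patch_cls to the class's bases unless already present.
        if patch_cls not in cls.__bases__:
            cls.__bases__ += (patch_cls,)

class Target(Patchable):
    pass

class Extras(object):
    def describe(self):
        return "patched"

Target.monkeypatch(Extras)
print(Target().describe())
```

Note that Target inherits from an intermediate class rather than directly from object; CPython rejects __bases__ assignment on classes whose only base is object.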
values() → list of D's values