agate.data_types

This module contains the DataType class and its subclasses. These types define how data should be converted during the creation of a Table.

A TypeTester class is also included, which can be used to infer data types from column data.

class agate.data_types.TypeTester(force={}, locale='en_US', limit=None)

Infer data types for the columns in a given set of data.

Parameters:
  • force – A dictionary where each key is a column name and each value is a DataType instance that overrides inference.
  • locale – A locale to use when evaluating the types of data. See Number.
  • limit – An optional limit on how many rows to evaluate before selecting the most likely type. Note that applying a limit may cause errors when the data is cast, if the guess proves incorrect for later rows of data.
run(rows, column_names)

Apply type inference to the provided data and return an array of column types.

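Parameters:
  • rows – The data as a sequence of sequences (tuples, lists, etc.).
  • column_names – A sequence of column names, one per column in rows. These are the names matched against the keys of force.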
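The effect of force and limit on inference can be pictured with a standalone sketch. This is not agate's implementation: the infer_types helper, the candidate order (Number, Date, Text), and the simplified parsers are assumptions made for illustration, and only the standard library is used.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

NULL_VALUES = ('', 'na', 'n/a', 'none', 'null', '.')


def is_number(value):
    """Return True if the string parses as a decimal number."""
    try:
        Decimal(value)
        return True
    except InvalidOperation:
        return False


def is_date(value):
    """Return True if the string is an ISO-style date (an assumed format)."""
    try:
        datetime.strptime(value, '%Y-%m-%d')
        return True
    except ValueError:
        return False


# Hypothetical candidate order: most specific first, Text as the fallback.
CANDIDATES = [('Number', is_number), ('Date', is_date), ('Text', lambda v: True)]


def infer_types(rows, column_names, force=None, limit=None):
    """Pick the first candidate type that accepts every sampled value."""
    force = force or {}
    sample = rows[:limit] if limit is not None else rows
    types = []
    for index, name in enumerate(column_names):
        if name in force:
            # A forced type bypasses inference entirely.
            types.append(force[name])
            continue
        values = [row[index] for row in sample
                  if row[index].strip().lower() not in NULL_VALUES]
        for type_name, check in CANDIDATES:
            if all(check(v) for v in values):
                types.append(type_name)
                break
    return types


rows = [('1', '2015-01-01', 'a'), ('2', 'n/a', 'b'), ('x', '2015-01-03', 'c')]
names = ['count', 'day', 'label']

all_rows = infer_types(rows, names)
limited = infer_types(rows, names, limit=2)
forced = infer_types(rows, names, force={'label': 'Number'})
```

With all three rows, the first column is inferred as Text because of the value 'x'. With limit=2 that row is never examined, so the column is guessed to be Number, and casting the third row would later fail. This is the risk the limit parameter description warns about.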
agate.data_types.base.DEFAULT_NULL_VALUES = ('', 'na', 'n/a', 'none', 'null', '.')

Default values which will be automatically cast to None.

class agate.data_types.base.DataType(null_values=('', 'na', 'n/a', 'none', 'null', '.'))

Bases: object

Base class for data types.

Parameters:null_values – A sequence of values which should be cast to None when encountered with this type.
test(d)

Test, for purposes of type inference, if a value could possibly be coerced to this data type.

This is really just a thin wrapper around DataType.cast().

cast(d)

Coerce a given string value into this column’s data type.
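The relationship between test() and cast() can be sketched as follows. The SketchType and SketchInteger classes are hypothetical stand-ins written for this example, not agate classes, and the choice of exceptions treated as a failed cast is an assumption.

```python
class SketchType:
    """Illustrative stand-in for a data type; not agate's DataType."""

    def __init__(self, null_values=('', 'na', 'n/a', 'none', 'null', '.')):
        self.null_values = null_values

    def cast(self, d):
        raise NotImplementedError

    def test(self, d):
        # A value "could be" this type if casting it does not fail.
        try:
            self.cast(d)
        except (ValueError, TypeError):
            return False
        return True


class SketchInteger(SketchType):
    """Hypothetical subclass that casts strings to int."""

    def cast(self, d):
        # Null markers are mapped to None before any parsing is attempted.
        if isinstance(d, str) and d.strip().lower() in self.null_values:
            return None
        return int(d)


integer = SketchInteger()
accepts_number = integer.test('42')
accepts_null = integer.test('n/a')
accepts_word = integer.test('abc')
```

Because null values cast cleanly to None, they pass test() for any type. In this sketch, that is why a column containing 'n/a' can still be treated as numeric.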

csvify(d)

Format a given native value for CSV serialization.

jsonify(d)

Format a given native value for JSON serialization.
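The need for csvify() and jsonify() can be illustrated with the standard library alone: the json module cannot serialize date or Decimal objects by default, so native values have to be mapped to plain ones first. The to_serializable helper below is hypothetical, and the specific conversions it applies (ISO 8601 strings for dates, floats for Decimals) are assumptions for the example rather than a statement of what agate does.

```python
import csv
import io
import json
from datetime import date
from decimal import Decimal


def to_serializable(value):
    """Hypothetical helper: map a native value to a plain, writable one."""
    if value is None:
        return None
    if isinstance(value, date):
        return value.isoformat()
    if isinstance(value, Decimal):
        return float(value)
    return value


row = [date(2015, 3, 1), Decimal('1.5'), None]
plain = [to_serializable(v) for v in row]

# JSON output: None becomes null.
json_text = json.dumps(plain)

# CSV output: None becomes an empty field.
buffer = io.StringIO(newline='')
csv.writer(buffer).writerow(plain)
csv_text = buffer.getvalue()
```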

agate.data_types.boolean.DEFAULT_TRUE_VALUES = ('yes', 'y', 'true', 't', '1')

Default values which will be automatically cast to True.

agate.data_types.boolean.DEFAULT_FALSE_VALUES = ('no', 'n', 'false', 'f', '0')

Default values which will be automatically cast to False.

class agate.data_types.boolean.Boolean(true_values=('yes', 'y', 'true', 't', '1'), false_values=('no', 'n', 'false', 'f', '0'), null_values=('', 'na', 'n/a', 'none', 'null', '.'))

Bases: agate.data_types.base.DataType

Data type representing boolean values.

Note: numerical 1 and 0 are considered valid boolean values, but no other numbers are.

Parameters:
  • true_values – A sequence of values which should be cast to True when encountered with this type.
  • false_values – A sequence of values which should be cast to False when encountered with this type.
cast(d)

Cast a single value to bool.

Parameters:d – A value to cast.
Returns:bool or None.
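A minimal sketch of the casting rules described above, using the documented default value lists. The cast_boolean function is hypothetical, and both the case-insensitive comparison and the use of ValueError for rejected input are assumptions, not agate's confirmed behavior.

```python
TRUE_VALUES = ('yes', 'y', 'true', 't', '1')
FALSE_VALUES = ('no', 'n', 'false', 'f', '0')
NULL_VALUES = ('', 'na', 'n/a', 'none', 'null', '.')


def cast_boolean(d, true_values=TRUE_VALUES, false_values=FALSE_VALUES,
                 null_values=NULL_VALUES):
    """Cast a value to True, False, or None; raise ValueError otherwise."""
    if d is None or isinstance(d, bool):
        return d
    if isinstance(d, int):
        # Only the numbers 1 and 0 are valid booleans.
        if d == 1:
            return True
        if d == 0:
            return False
    elif isinstance(d, str):
        text = d.strip().lower()  # assumed: matching ignores case
        if text in null_values:
            return None
        if text in true_values:
            return True
        if text in false_values:
            return False
    raise ValueError('Cannot cast %r to a boolean' % (d,))


results = [cast_boolean(v) for v in ('Yes', 'f', 'N/A', 1, 0)]
```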
jsonify(d)
class agate.data_types.date.Date(date_format=None, **kwargs)

Bases: agate.data_types.base.DataType

Data type representing dates only.

Parameters:date_format – A format string for datetime.datetime.strptime() to use instead of regex-based parsing.
cast(d)

Cast a single value to a datetime.date.

Parameters:d – A value to cast.
Returns:datetime.date or None.
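What an explicit format string buys can be shown with datetime.datetime.strptime() directly. The parse_date helper is hypothetical and only approximates the idea; the handling of empty strings is an assumption.

```python
from datetime import date, datetime


def parse_date(d, date_format):
    """Parse a string with an explicit strptime format; blank becomes None."""
    if d is None or d.strip() == '':
        return None
    # strptime returns a datetime; keep only the date part.
    return datetime.strptime(d.strip(), date_format).date()


# '03/01/2015' is ambiguous without a format: March 1 or January 3?
us_style = parse_date('03/01/2015', '%m/%d/%Y')
eu_style = parse_date('03/01/2015', '%d/%m/%Y')
blank = parse_date('', '%m/%d/%Y')
```

An explicit format removes the ambiguity between month-first and day-first dates, which pattern-based parsing has to guess at.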
csvify(d)
jsonify(d)
class agate.data_types.date_time.DateTime(datetime_format=None, timezone=None, **kwargs)

Bases: agate.data_types.base.DataType

Data type representing dates and times.

Parameters:
  • datetime_format – A format string for datetime.datetime.strptime() to use instead of regex-based parsing.
  • timezone – A pytz timezone to apply to each parsed date.
cast(d)

Cast a single value to a datetime.datetime.

Parameters:d – A value to cast.
Returns:datetime.datetime or None.
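The combination of a format string and a timezone can be sketched with the standard library. This example substitutes a fixed-offset datetime.timezone for a pytz timezone so that it has no third-party dependency, and the parse_datetime helper is hypothetical. With an actual pytz timezone, the usual way to attach it to a naive datetime is its localize() method rather than replace(tzinfo=...).

```python
from datetime import datetime, timedelta, timezone


def parse_datetime(d, datetime_format, tz=None):
    """Parse a string with strptime and optionally attach a timezone."""
    dt = datetime.strptime(d, datetime_format)
    if tz is not None:
        # Safe for fixed-offset zones; pytz zones should use tz.localize(dt).
        dt = dt.replace(tzinfo=tz)
    return dt


# Stand-in for a pytz timezone: a fixed UTC-5 offset.
fixed_offset = timezone(timedelta(hours=-5), 'EST')

naive = parse_datetime('2015-03-01 14:30:00', '%Y-%m-%d %H:%M:%S')
aware = parse_datetime('2015-03-01 14:30:00', '%Y-%m-%d %H:%M:%S', fixed_offset)
aware_text = aware.isoformat()
```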
csvify(d)
jsonify(d)
agate.data_types.number.CURRENCY_SYMBOLS = [u'\u060b', u'$', u'\u0192', u'\u17db', u'\xa5', u'\u20a1', u'\u20b1', u'\xa3', u'\u20ac', u'\xa2', u'\ufdfc', u'\u20aa', u'\u20a9', u'\u20ad', u'\u20ae', u'\u20a6', u'\u0e3f', u'\u20a4', u'\u20ab']

A list of currency symbols sourced from Xe.

class agate.data_types.number.Number(locale='en_US', float_precision=10, **kwargs)

Bases: agate.data_types.base.DataType

Data type representing numbers.

Parameters:
  • locale – A locale specification such as en_US or de_DE to use for parsing formatted numbers.
  • float_precision – An integer specifying how many decimal places to include when converting Python’s native floats to Decimals. Beyond this point values will be rounded. This does not apply to string representations of fractional numbers.
cast(d)

Cast a single value to a decimal.Decimal.

Returns:decimal.Decimal or None.
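The purpose of float_precision can be seen by comparing a direct float-to-Decimal conversion with one that rounds first. The float_to_decimal helper below is a hypothetical illustration of rounding to a fixed number of decimal places, and the formatting approach it uses is an assumption rather than agate's confirmed implementation.

```python
from decimal import Decimal


def float_to_decimal(value, float_precision=10):
    """Round a float to a fixed number of places, then build a Decimal."""
    return Decimal('%.*f' % (float_precision, value))


noisy = 0.1 + 0.2  # binary floating point cannot represent 0.3 exactly

# Direct conversion preserves the binary representation error.
exact = Decimal(noisy)

# Rounding to 10 places first discards it.
rounded = float_to_decimal(noisy)

# Strings are parsed as written, so no rounding is needed.
from_text = Decimal('0.12345678901234')
```

Here rounded equals Decimal('0.3'), while exact does not. String input such as '0.12345678901234' keeps all of its digits, consistent with the note that float_precision does not apply to string representations.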
jsonify(d)
class agate.data_types.text.Text(null_values=('', 'na', 'n/a', 'none', 'null', '.'))

Bases: agate.data_types.base.DataType

Data type representing text.

cast(d)

Cast a single value to unicode() (str() in Python 3).

Parameters:d – A value to cast.
Returns:unicode() (str() in Python 3) or None.