# Compute new values¶

## Change¶

```new_table = table.compute([
('2000_change', agate.Change('2000', '2001')),
('2001_change', agate.Change('2001', '2002')),
('2002_change', agate.Change('2002', '2003'))
])
```

Or, better yet, compute the whole decade using a loop:

```computations = []

for year in range(2000, 2010):
change = agate.Change(year, year + 1)
computations.append(('%i_change' % year, change))

new_table = table.compute(computations)
```

## Percent¶

Calculate the percentage for each value in a column with `Percent`. Values are divided into the sum of the column by default.

```columns = ('value',)
rows = (,,,)
new_table = agate.Table(rows, columns)

new_table = new_table.compute([
('percent', agate.Percent('value'))
])

new_table.print_table()

| value | percent |
| ----- | ------- |
|     1 |      10 |
|     2 |      20 |
|     2 |      20 |
|     5 |      50 |
```

Override the denominator with a keyword argument.

```new_table = new_table.compute([
('percent', agate.Percent('value', 5))
])

new_table.print_table()

| value | percent |
| ----- | ------- |
|     1 |      20 |
|     2 |      40 |
|     2 |      40 |
|     5 |     100 |
```

## Percent change¶

Want percent change instead of value change? Just swap out the `Computation`:

```computations = []

for year in range(2000, 2010):
change = agate.PercentChange(year, year + 1)
computations.append(('%i_change' % year, change))

new_table = table.compute(computations)
```

## Indexed/cumulative change¶

Need your change indexed to a starting year? Just fix the first argument:

```computations = []

for year in range(2000, 2010):
change = agate.Change(2000, year + 1)
computations.append(('%i_change' % year, change))

new_table = table.compute(computations)
```

Of course you can also use `PercentChange` if you need percents rather than values.

## Round to two decimal places¶

agate stores numerical values using Python’s `decimal.Decimal` type. This data type ensures numerical precision beyond what is supported by the native `float()` type, however, because of this we can not use Python’s builtin `round()` function. Instead we must use `decimal.Decimal.quantize()`.

We can use `Table.compute()` to apply the quantize to generate a rounded column from an existing one:

```from decimal import Decimal

number_type = agate.Number()

def round_price(row):
return row['price'].quantize(Decimal('0.01'))

new_table = table.compute([
('price_rounded', agate.Formula(number_type, round_price))
])
```

To round to one decimal place you would simply change `0.01` to `0.1`.

## Difference between dates¶

Calculating the difference between dates (or dates and times) works exactly the same as it does for numbers:

```new_table = table.compute([
('age_at_death', agate.Change('born', 'died'))
])
```

## Levenshtein edit distance¶

The Levenshtein edit distance is a common measure of string similarity. It can be used, for instance, to check for typos between manually-entered names and a version that is known to be spelled correctly.

Implementing Levenshtein requires writing a custom `Computation`. To save ourselves building the whole thing from scratch, we will lean on the python-Levenshtein library for the actual algorithm.

```import agate
from Levenshtein import distance
import six

class LevenshteinDistance(agate.Computation):
"""
Computes Levenshtein edit distance between the column and a given string.
"""
def __init__(self, column_name, compare_string):
self._column_name = column_name
self._compare_string = compare_string

def get_computed_data_type(self, table):
"""
The return value is a numerical distance.
"""
return agate.Number()

def validate(self, table):
"""
Verify the column is text.
"""
column = table.columns[self._column_name]

if not isinstance(column.data_type, agate.Text):
raise agate.DataTypeError('Can only be applied to Text data.')

def run(self, table):
"""
Find the distance, returning null when the input column was null.
"""
new_column = []

for row in table.rows:
val = row[self._column_name]

if val is None:
new_column.append(None)
else:
new_column.append(distance(val, self._compare_string))

return new_column
```

This code can now be applied to any `Table` just as any other `Computation` would be:

```new_table = table.compute([
('distance', LevenshteinDistance('column_name', 'string to compare'))
])
```

The resulting column will contain an integer measuring the edit distance between the value in the column and the comparison string.

## USA Today Diversity Index¶

The USA Today Diversity Index is a widely cited method for evaluating the racial diversity of a given area. Using a custom `Computation` makes it simple to calculate.

Assuming that your data has a column for the total population, another for the population of each race and a final column for the hispanic population, you can implement the diversity index like this:

```class USATodayDiversityIndex(agate.Computation):
def get_computed_data_type(self, table):
return agate.Number()

def run(self, table):
new_column = []

for row in table.rows:
race_squares = 0

for race in ['white', 'black', 'asian', 'american_indian', 'pacific_islander']:
race_squares += (row[race] / row['population']) ** 2

hispanic_squares = (row['hispanic'] / row['population']) ** 2
hispanic_squares += (1 - (row['hispanic'] / row['population'])) ** 2

new_column.append((1 - (race_squares * hispanic_squares)) * 100)

return new_column
```

We apply the diversity index like any other computation:

```with_index = table.compute([
('diversity_index', USATodayDiversityIndex())
])
```

## Simple Moving Average¶

A simple moving average is the average of some number of prior values in a series. It is typically used to smooth out variation in time series data.

The following custom `Computation` will compute a simple moving average. This example assumes your data is already sorted.

```class SimpleMovingAverage(agate.Computation):
"""
Computes the simple moving average of a column over some interval.
"""
def __init__(self, column_name, interval):
self._column_name = column_name
self._interval = interval

def get_computed_data_type(self, table):
"""
The return value is a numerical average.
"""
return agate.Number()

def validate(self, table):
"""
Verify the column is numerical.
"""
column = table.columns[self._column_name]

if not isinstance(column.data_type, agate.Number):
raise agate.DataTypeError('Can only be applied to Number data.')

def run(self, table):
new_column = []

for i, row in enumerate(table.rows):
if i < self._interval:
new_column.append(None)
else:
values = tuple(r[self._column_name] for r in table.rows[i - self._interval:i])

if None in values:
new_column.append(None)
else:
new_column.append(sum(values) / self._interval)

return new_column
```

You would use the simple moving average like so:

```with_average = table.compute([
('six_month_moving_average', SimpleMovingAverage('price', 6))
])
```