# Compute new values¶

## Annual change¶

You could use a `Formula`

to calculate percent change, however, for your convenience agate has a built-in shortcut. For example, if your spreadsheet has a column with values for each year you could do:

```
new_table = table.compute([
(Change('2000', '2001'), '2000_change'),
(Change('2001', '2002'), '2001_change'),
(Change('2002', '2003'), '2002_change')
])
```

Or, better yet, compute the whole decade using a loop:

```
computations = []
for year in range(2000, 2010):
change = Change(year, year + 1)
computations.append((change, '%i_change' % year))
new_table = table.compute(computations)
```

## Annual percent change¶

Want percent change instead of value change? Just swap out the `Aggregation`

:

```
computations = []
for year in range(2000, 2010):
change = PercentChange(year, year + 1)
computations.append((change, '%i_change' % year))
new_table = table.compute(computations)
```

## Indexed/cumulative change¶

Need your change indexed to a starting year? Just fix the first argument:

```
computations = []
for year in range(2000, 2010):
change = Change(2000, year + 1)
computations.append((change, '%i_change' % year))
new_table = table.compute(computations)
```

Of course you can also use `PercentChange`

if you need percents rather than values.

## Round to two decimal places¶

agate stores numerical values using Python’s `decimal.Decimal`

type. This data type ensures numerical precision beyond what is supported by the native `float()`

type, however, because of this we can not use Python’s builtin `round()`

function. Instead we must use `decimal.Decimal.quantize()`

.

We can use `Table.compute()`

to apply the quantize to generate a rounded column from an existing one:

```
from decimal import Decimal
number_type = agate.Number()
def round_price(row):
return row['price'].quantize(Decimal('0.01'))
new_table = table.compute([
(Formula(number_type, round_price), 'price_rounded')
])
```

To round to one decimal place you would simply change `0.01`

to `0.1`

.

## Difference between dates¶

Calculating the difference between dates (or dates and times) works exactly the same as it does for numbers:

```
new_table = table.compute([
(Change('born', 'died'), 'age_at_death')
])
```

## Levenshtein edit distance¶

The Levenshtein edit distance is a common measure of string similarity. It can be used, for instance, to check for typos between manually-entered names and a version that is known to be spelled correctly.

Implementing Levenshtein requires writing a custom `Computation`

. To save ourselves building the whole thing from scratch, we will lean on the python-Levenshtein library for the actual algorithm.

```
import agate
from Levenshtein import distance
import six
class LevenshteinDistance(agate.Computation):
"""
Computes Levenshtein edit distance between the column and a given string.
"""
def __init__(self, column_name, compare_string):
self._column_name = column_name
self._compare_string = compare_string
super(LevenshteinDistance, self).__init__()
def get_computed_column_type(self, table):
"""
The return value is a numerical distance.
"""
return agate.Number()
def _prepare(self, table):
"""
Verify the column is text.
"""
column = table.columns[self._column_name]
if not isinstance(column.data_type, agate.Text):
raise agate.DataTypeError('Can only be applied to Text data.')
super(LevenshteinDistance, self)._prepare(table)
def run(self, row):
"""
Find the distance, returning null when the input column was null.
"""
super(LevenshteinDistance, self).run(row)
val = row[self._column_name]
if val is None:
return None
return distance(val, self._compare_string)
```

This code can now be applied to any `Table`

just as any other `Computation`

would be:

```
new_table = table.compute([
(LevenshteinDistance('column_name', 'string to compare'), 'distance')
])
```

The resulting column will contain an integer measuring the edit distance between the value in the column and the comparison string.

## USA Today Diversity Index¶

The USA Today Diversity Index is a widely cited method for evaluating the racial diversity of a given area. Using a custom `Computation`

makes it simple to calculate.

Assuming that your data has a column for the total population, another for the population of each race and a final column for the hispanic population, you can implement the diversity index like this:

```
class USATodayDiversityIndex(agate.Computation):
def get_computed_column_type(self, table):
return agate.Number()
def run(self, row):
super(USATodayDiversityIndex, self).run(row)
race_squares = 0
for race in ['white', 'black', 'asian', 'american_indian', 'pacific_islander']:
race_squares += (row[race] / row['population']) ** 2
hispanic_squares = (row['hispanic'] / row['population']) ** 2
hispanic_squares += (1 - (row['hispanic'] / row['population'])) ** 2
return (1 - (race_squares * hispanic_squares)) * 100
```

We apply the diversity index like any other computation:

```
with_index = table.compute([
(USATodayDiversityIndex(), 'diversity_index')
])
```