Homogenize rows

Fill in missing rows in a series. This can be used, for instance, to add rows for missing years in a time series.

Create rows for missing values

Given a sequence of expected values, we can insert a default row for each value that is missing from the table's key column.

Starting with a table like this, we can fill in rows for all missing years:

year  female_count  male_count
1997  2             1
2000  4             3
2002  4             5
2003  1             2

key = 'year'
expected_values = (1997, 1998, 1999, 2000, 2001, 2002, 2003)

# The default row supplies a value for every column except `key`
default_row = (0, 0)

new_table = table.homogenize(key, expected_values, default_row)

The result will be:

year  female_count  male_count
1997  2             1
1998  0             0
1999  0             0
2000  4             3
2001  0             0
2002  4             5
2003  1             2
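
To make the fill-in logic concrete, here is a minimal plain-Python sketch that reproduces the table above without agate. It is not agate's implementation: `fill_missing_rows` is a hypothetical helper, and the sketch assumes that the key is the first column and that the output is sorted by key, as in the result shown.

```python
def fill_missing_rows(rows, expected_values, default_row):
    """Return rows plus one default row per expected key that is absent.

    Hypothetical helper for illustration; assumes the key is column 0.
    """
    present = {row[0] for row in rows}
    filled = list(rows)
    for value in expected_values:
        if value not in present:
            # Prepend the missing key to the non-key default values.
            filled.append((value,) + tuple(default_row))
    # Sort by key so the new rows appear in sequence order.
    return sorted(filled, key=lambda row: row[0])


rows = [(1997, 2, 1), (2000, 4, 3), (2002, 4, 5), (2003, 1, 2)]
expected_values = tuple(range(1997, 2004))  # 1997 through 2003 inclusive

filled_rows = fill_missing_rows(rows, expected_values, (0, 0))
for row in filled_rows:
    print(*row)
```

Building `expected_values` with `range` avoids typing each year by hand, which helps when the series is long.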

Create dynamic rows based on missing values

Instead of a fixed default row, we can pass a function that generates each new row from the missing value:

key = 'year'
expected_values = (1997, 1998, 1999, 2000, 2001, 2002, 2003)

# If default_row is a function, it must return a full row, including the `key` column
def default_row(missing_value):
    return (missing_value, missing_value - 1997, missing_value - 1997)

new_table = table.homogenize(key, expected_values, default_row)

The new table will be:

year  female_count  male_count
1997  2             1
1998  1             1
1999  2             2
2000  4             3
2001  4             4
2002  4             5
2003  1             2
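
The same idea with a row-generating function can be sketched in plain Python. Again, this is an illustration rather than agate's implementation: `fill_missing_rows_with` is a hypothetical helper that assumes the key is the first column, that the function receives the missing key as a plain number, and that the output is sorted by key.

```python
def fill_missing_rows_with(rows, expected_values, make_row):
    """Return rows plus one generated row per expected key that is absent.

    Hypothetical helper for illustration; assumes the key is column 0 and
    that make_row returns a full row, key included.
    """
    present = {row[0] for row in rows}
    generated = [
        tuple(make_row(value))
        for value in expected_values
        if value not in present
    ]
    # Sort by key so generated rows sit between the existing ones.
    return sorted(list(rows) + generated, key=lambda row: row[0])


def default_row(missing_value):
    # Both counts are set to the number of years elapsed since 1997.
    offset = missing_value - 1997
    return (missing_value, offset, offset)


rows = [(1997, 2, 1), (2000, 4, 3), (2002, 4, 5), (2003, 1, 2)]

dynamic_rows = fill_missing_rows_with(rows, range(1997, 2004), default_row)
for row in dynamic_rows:
    print(*row)
```

Because the function returns the whole row, it can derive any column from the missing key, not only fill in constants.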