Homogenize rows

Fill in missing rows in a series. This can be used, for instance, to add rows for missing years in a time series.

Create rows for missing values

Given a sequence of expected values, we can insert a default row for each value that does not yet appear in the table's key column.

Starting with a table like this, we can fill in rows for all missing years:

year  female_count  male_count
1997  2             1
2000  4             3
2002  4             5
2003  1             2

key = 'year'
expected_values = (1997, 1998, 1999, 2000, 2001, 2002, 2003)

# The default row supplies a value for every column that is not in `key`
default_row = (0, 0)

new_table = table.homogenize(key, expected_values, default_row)

The result will be:

year  female_count  male_count
1997  2             1
1998  0             0
1999  0             0
2000  4             3
2001  0             0
2002  4             5
2003  1             2
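
To make the fill-in logic concrete, here is a minimal pure-Python sketch of the same idea using plain tuples. It is not the library's implementation: `fill_missing_rows` and `key_index` are hypothetical names introduced only for this illustration, and sorting the result by key is a choice made here so the output matches the table above.

```python
def fill_missing_rows(rows, key_index, expected_values, default_row):
    """Return rows plus one default row per expected key value that is absent."""
    present = {row[key_index] for row in rows}
    filled = list(rows)
    for value in expected_values:
        if value not in present:
            # The default row holds only the non-key columns,
            # so the key value is inserted back into its column.
            new_row = list(default_row)
            new_row.insert(key_index, value)
            filled.append(tuple(new_row))
    # Sorting by key is an assumption of this sketch, not documented behavior.
    return sorted(filled, key=lambda row: row[key_index])


rows = [(1997, 2, 1), (2000, 4, 3), (2002, 4, 5), (2003, 1, 2)]
expected_values = (1997, 1998, 1999, 2000, 2001, 2002, 2003)

filled = fill_missing_rows(rows, 0, expected_values, (0, 0))
for row in filled:
    print(row)
```

Rows that already exist are left untouched; only the missing years 1998, 1999 and 2001 receive the default counts.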

Create dynamic rows based on missing values

We can also specify new row values with a value-generating function:

key = 'year'
expected_values = (1997, 1998, 1999, 2000, 2001, 2002, 2003)

# If default_row is a function, it should return a full row,
# including the value for the key column
def default_row(missing_value):
    return (missing_value, missing_value - 1997, missing_value - 1997)

new_table = table.homogenize(key, expected_values, default_row)

The new table will be:

year  female_count  male_count
1997  2             1
1998  1             1
1999  2             2
2000  4             3
2001  4             4
2002  4             5
2003  1             2
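
The function-based variant can be sketched the same way in plain Python. As before, this is an illustration under stated assumptions rather than the library's code: `fill_missing_rows_with` and `make_row` are hypothetical names, the generating function is assumed to receive a single scalar key value, and the result is sorted by key only so that it lines up with the table above.

```python
def fill_missing_rows_with(rows, key_index, expected_values, make_row):
    """Return rows plus one generated row per expected key value that is absent."""
    present = {row[key_index] for row in rows}
    # make_row returns a full row, key column included.
    generated = [
        tuple(make_row(value))
        for value in expected_values
        if value not in present
    ]
    # Sorting by key is an assumption of this sketch, not documented behavior.
    return sorted(list(rows) + generated, key=lambda row: row[key_index])


def default_row(missing_value):
    # Both counts are derived from the distance to the first year.
    offset = missing_value - 1997
    return (missing_value, offset, offset)


rows = [(1997, 2, 1), (2000, 4, 3), (2002, 4, 5), (2003, 1, 2)]
expected_values = (1997, 1998, 1999, 2000, 2001, 2002, 2003)

filled = fill_missing_rows_with(rows, 0, expected_values, default_row)
for row in filled:
    print(row)
```

Because the function builds the whole row, each generated row can depend on the missing value itself, which a fixed default row cannot express.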