Homogenize rows
Fill in missing rows in a series. This can be used, for instance, to add rows for missing years in a time series.
Create rows for missing values
Given a sequence of expected values, we can insert a default row for each value that is missing from the table's key column.
Starting with a table like this, we can fill in rows for all missing years:
year | female_count | male_count |
---|---|---|
1997 | 2 | 1 |
2000 | 4 | 3 |
2002 | 4 | 5 |
2003 | 1 | 2 |
key = 'year'
expected_values = (1997, 1998, 1999, 2000, 2001, 2002, 2003)
# Your default row should specify column values not in `key`
default_row = (0, 0)
new_table = table.homogenize(key, expected_values, default_row)
The result will be:
year | female_count | male_count |
---|---|---|
1997 | 2 | 1 |
1998 | 0 | 0 |
1999 | 0 | 0 |
2000 | 4 | 3 |
2001 | 0 | 0 |
2002 | 4 | 5 |
2003 | 1 | 2 |
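To make the fill-in logic concrete, here is a minimal plain-Python sketch of the same idea. It does not use the library and is not its implementation: the helper `fill_missing_rows` and its parameters are hypothetical names chosen for illustration, and sorting the output by key is an assumption made only so the result is easy to compare with the table above.

```python
def fill_missing_rows(rows, key_index, expected_values, default_values):
    """Return the rows plus one default row per expected key value that is absent.

    Hypothetical helper for illustration only; not part of any library.
    """
    # Key values that already appear in the data
    present = {row[key_index] for row in rows}
    filled = list(rows)

    for value in expected_values:
        if value not in present:
            # default_values covers every column except the key,
            # so the key value is inserted at its column position
            new_row = list(default_values)
            new_row.insert(key_index, value)
            filled.append(tuple(new_row))

    # Sorting by key is an assumption made here for readability
    return sorted(filled, key=lambda row: row[key_index])


rows = [(1997, 2, 1), (2000, 4, 3), (2002, 4, 5), (2003, 1, 2)]
filled = fill_missing_rows(rows, 0, range(1997, 2004), (0, 0))

for row in filled:
    print(row)
```

In this sketch the default values are copied for each missing year, so the rows for 1998, 1999 and 2001 all carry `0` in both count columns.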
Create dynamic rows based on missing values
We can also specify new row values with a value-generating function:
key = 'year'
expected_values = (1997, 1998, 1999, 2000, 2001, 2002, 2003)
# If default row is a function, it should return a full row
def default_row(missing_value):
    return (missing_value, missing_value - 1997, missing_value - 1997)
new_table = table.homogenize(key, expected_values, default_row)
The new table will be:
year | female_count | male_count |
---|---|---|
1997 | 2 | 1 |
1998 | 1 | 1 |
1999 | 2 | 2 |
2000 | 4 | 3 |
2001 | 4 | 4 |
2002 | 4 | 5 |
2003 | 1 | 2 |
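The function-based variant can be sketched in plain Python as well. As before, this is an illustration under stated assumptions rather than the library's implementation: `fill_missing_rows_with` is a hypothetical helper, the row-generating function is assumed to receive the bare missing key value, and the output is sorted by key only to match the table above.

```python
def fill_missing_rows_with(rows, key_index, expected_values, make_row):
    """Return the rows plus one generated row per expected key value that is absent.

    Hypothetical helper for illustration only; not part of any library.
    """
    # Key values that already appear in the data
    present = {row[key_index] for row in rows}

    # make_row must return a full row, including the key column
    generated = [
        tuple(make_row(value))
        for value in expected_values
        if value not in present
    ]

    # Sorting by key is an assumption made here for readability
    return sorted(list(rows) + generated, key=lambda row: row[key_index])


def default_row(missing_value):
    # Both counts are set to the number of years since 1997
    offset = missing_value - 1997
    return (missing_value, offset, offset)


rows = [(1997, 2, 1), (2000, 4, 3), (2002, 4, 5), (2003, 1, 2)]
filled = fill_missing_rows_with(rows, 0, range(1997, 2004), default_row)

for row in filled:
    print(row)
```

Unlike a fixed default row, the generated rows differ by year: 1998 gets counts of 1, 1999 gets 2, and 2001 gets 4.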