module utilsforecast.preprocessing
Utilities for processing data before training/analysis
function id_time_grid
Parameters:
df (pandas or polars DataFrame): Input data.
freq (str or int): Series' frequency.
start (str, int, date or datetime, optional): Initial timestamp for the series. Defaults to 'per_serie'.
    * 'per_serie' uses each series' first timestamp.
    * 'global' uses the first timestamp seen in the data.
    * Can also be a specific timestamp or integer, e.g. '2000-01-01', 2000 or datetime(2000, 1, 1).
end (str, int, date or datetime, optional): Final timestamp for the series. Defaults to 'global'.
    * 'per_serie' uses each series' last timestamp.
    * 'global' uses the last timestamp seen in the data.
    * Can also be a specific timestamp or integer, e.g. '2000-01-01', 2000 or datetime(2000, 1, 1).
id_col (str, optional): Column that identifies each series. Defaults to 'unique_id'.
time_col (str, optional): Column that identifies each timestamp. Defaults to 'ds'.
Returns:
pandas or polars DataFrame: DataFrame with the expected ids and times.
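The interaction between the 'per_serie' and 'global' options can be easier to see on a toy example. The following is a minimal pure-Python sketch of the grid semantics for integer timestamps only. The helper `expected_grid` and its dict-based input are hypothetical and are not part of utilsforecast; the real function operates on DataFrames and also supports date-like timestamps.

```python
from typing import Dict, List, Tuple, Union

Bound = Union[str, int]


def expected_grid(
    observed: Dict[str, List[int]],
    freq: int,
    start: Bound = "per_serie",
    end: Bound = "global",
) -> List[Tuple[str, int]]:
    """Hypothetical helper: list every (id, time) pair expected for each series.

    `observed` maps a series id to the integer timestamps seen for it.
    """
    all_times = [t for times in observed.values() for t in times]
    global_first, global_last = min(all_times), max(all_times)

    def resolve(bound: Bound, per_serie_value: int, global_value: int) -> int:
        # 'per_serie' and 'global' are keywords; anything else is taken
        # as an explicit integer timestamp.
        if bound == "per_serie":
            return per_serie_value
        if bound == "global":
            return global_value
        return bound

    grid: List[Tuple[str, int]] = []
    for uid in sorted(observed):
        times = observed[uid]
        first = resolve(start, min(times), global_first)
        last = resolve(end, max(times), global_last)
        grid.extend((uid, t) for t in range(first, last + 1, freq))
    return grid


observed = {"a": [1, 2, 4], "b": [2, 3]}

# Defaults: each series starts at its own first timestamp and
# ends at the last timestamp seen anywhere in the data.
grid = expected_grid(observed, freq=1)

# Explicit integer bounds apply to every series.
fixed = expected_grid(observed, freq=1, start=0, end=5)
```

With the defaults, series "a" spans 1 to 4 and series "b" spans 2 to 4, so "b" is extended past its own last observation (3) up to the global last timestamp (4).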
function fill_gaps
Parameters:
df (pandas or polars DataFrame): Input data.
freq (str or int): Series' frequency.
start (str, int, date or datetime, optional): Initial timestamp for the series. Defaults to 'per_serie'.
    * 'per_serie' uses each series' first timestamp.
    * 'global' uses the first timestamp seen in the data.
    * Can also be a specific timestamp or integer, e.g. '2000-01-01', 2000 or datetime(2000, 1, 1).
end (str, int, date or datetime, optional): Final timestamp for the series. Defaults to 'global'.
    * 'per_serie' uses each series' last timestamp.
    * 'global' uses the last timestamp seen in the data.
    * Can also be a specific timestamp or integer, e.g. '2000-01-01', 2000 or datetime(2000, 1, 1).
id_col (str, optional): Column that identifies each series. Defaults to 'unique_id'.
time_col (str, optional): Column that identifies each timestamp. Defaults to 'ds'.
Returns:
pandas or polars DataFrame: DataFrame with gaps filled.

