TimeGPT on top of Dask.
Outline:
1. Installation
Install Dask through Fugue. Fugue provides an easy-to-use interface for distributed computing that lets users execute Python code on top of several distributed computing frameworks, including Dask. You can install fugue with pip.
Note: If executing on a distributed Dask cluster, ensure that the nixtla library is installed across all the workers.
2. Load Data
You can load your data as a pandas DataFrame. In this tutorial, we
will use a dataset that contains hourly electricity prices from
different markets.
| | unique_id | ds | y |
|---|---|---|---|
| 0 | BE | 2016-10-22 00:00:00 | 70.00 |
| 1 | BE | 2016-10-22 01:00:00 | 37.10 |
| 2 | BE | 2016-10-22 02:00:00 | 37.10 |
| 3 | BE | 2016-10-22 03:00:00 | 44.75 |
| 4 | BE | 2016-10-22 04:00:00 | 37.10 |
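The loading code itself is not shown here, so the following is only a minimal sketch of a frame in the same long format (`unique_id`, `ds`, `y`). The values are synthetic stand-ins rather than the real electricity prices, and `"DE"` is a made-up second series id used purely for illustration.

```python
import numpy as np
import pandas as pd


def make_example_frame(n_hours: int = 48) -> pd.DataFrame:
    """Build a small synthetic hourly frame with columns unique_id, ds, y."""
    ds = pd.date_range("2016-10-22 00:00:00", periods=n_hours, freq="h")
    rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
    frames = []
    # "BE" appears in the tutorial; "DE" is a hypothetical second market.
    for market in ("BE", "DE"):
        frames.append(
            pd.DataFrame(
                {
                    "unique_id": market,  # scalar is broadcast to every row
                    "ds": ds,
                    "y": rng.uniform(30.0, 70.0, size=n_hours).round(2),
                }
            )
        )
    return pd.concat(frames, ignore_index=True)


df = make_example_frame()
print(df.head())
```

With real data you would typically read a CSV into the same three columns instead of generating values.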
3. Import Dask
Import Dask and convert the pandas DataFrame to a Dask DataFrame.
| | unique_id | ds | y |
|---|---|---|---|
| npartitions=2 | | | |
| 0 | string | string | float64 |
| 4200 | … | … | … |
| 8399 | … | … | … |
4. Use TimeGPT on Dask
Using TimeGPT on top of Dask is almost identical to the
non-distributed case. The only difference is that you need to use a
Dask DataFrame, which we already defined in the previous step.
First, instantiate the NixtlaClient class.
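One possible way to instantiate the client without hard-coding credentials is sketched below. The helper names are hypothetical, and reading the key from `NIXTLA_API_KEY` and an optional endpoint from `NIXTLA_BASE_URL` is a choice made for this sketch; `api_key` and `base_url` are the `NixtlaClient` arguments this tutorial uses.

```python
import os


def client_kwargs_from_env(env=None):
    """Collect NixtlaClient keyword arguments from environment variables.

    The variable names are an assumption of this sketch, not a requirement.
    """
    env = os.environ if env is None else env
    kwargs = {"api_key": env["NIXTLA_API_KEY"]}  # raises KeyError if unset
    base_url = env.get("NIXTLA_BASE_URL")
    if base_url:
        # Only pass base_url when a custom endpoint is configured.
        kwargs["base_url"] = base_url
    return kwargs


def make_client(env=None):
    """Instantiate NixtlaClient (hypothetical helper).

    nixtla is imported lazily so that client_kwargs_from_env can be used
    and tested without the package installed.
    """
    from nixtla import NixtlaClient

    return NixtlaClient(**client_kwargs_from_env(env))
```

Calling `make_client()` requires the nixtla package and a valid API key in the environment.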
👍 Use an Azure AI endpoint: To use an Azure AI endpoint, set the `base_url` argument: `nixtla_client = NixtlaClient(base_url="your azure ai endpoint", api_key="your api_key")`
Then use any method from the NixtlaClient class, such as forecast or cross_validation.
| | unique_id | ds | TimeGPT |
|---|---|---|---|
| 0 | BE | 2016-12-31 00:00:00 | 45.190453 |
| 1 | BE | 2016-12-31 01:00:00 | 43.244446 |
| 2 | BE | 2016-12-31 02:00:00 | 41.958389 |
| 3 | BE | 2016-12-31 03:00:00 | 39.796486 |
| 4 | BE | 2016-12-31 04:00:00 | 39.204533 |
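A forecast like the one above could be requested roughly as follows. This is a sketch under assumptions: the wrapper name `forecast_on_dask` is hypothetical and the horizon `h=12` is an illustrative choice, not a value stated in this tutorial. The call it makes, `forecast(df=..., h=..., model=...)`, is the same one used in the non-distributed case.

```python
def forecast_on_dask(client, dask_df, h=12, model="timegpt-1"):
    """Forecast h steps ahead for every series in a Dask DataFrame.

    client  : an instantiated NixtlaClient
    dask_df : Dask DataFrame with unique_id, ds and y columns
    h       : forecast horizon (12 is an assumed example value)
    model   : "timegpt-1" is the default model for the public API
    """
    # Passing a Dask DataFrame instead of a pandas one is the only change
    # compared with the non-distributed call.
    return client.forecast(df=dask_df, h=h, model=model)


# Usage (needs the nixtla package, a valid API key and network access):
#   fcst_df = forecast_on_dask(nixtla_client, dask_df, h=12)
#   fcst_df.head()
```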
📘 Available models in Azure AI: If you are using an Azure AI endpoint, please be sure to set `model="azureai"`: `nixtla_client.forecast(..., model="azureai")`. For the public API, we support two models: `timegpt-1` and `timegpt-1-long-horizon`. By default, `timegpt-1` is used. Please see this tutorial on how and when to use `timegpt-1-long-horizon`.
| | unique_id | ds | cutoff | TimeGPT |
|---|---|---|---|---|
| 0 | BE | 2016-12-30 04:00:00 | 2016-12-30 03:00:00 | 39.375439 |
| 1 | BE | 2016-12-30 05:00:00 | 2016-12-30 03:00:00 | 40.039215 |
| 2 | BE | 2016-12-30 06:00:00 | 2016-12-30 03:00:00 | 43.455849 |
| 3 | BE | 2016-12-30 07:00:00 | 2016-12-30 03:00:00 | 47.716408 |
| 4 | BE | 2016-12-30 08:00:00 | 2016-12-30 03:00:00 | 50.31665 |
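Cross-validation output like the table above could be produced along these lines. Again this is only a sketch: the wrapper name is hypothetical, and the values `h=12`, `n_windows=5` and `step_size=2` are assumptions for illustration, not settings stated in this tutorial.

```python
def cross_validate_on_dask(client, dask_df, h=12, n_windows=5, step_size=2):
    """Run rolling-window cross-validation on a Dask DataFrame.

    client    : an instantiated NixtlaClient
    dask_df   : Dask DataFrame with unique_id, ds and y columns
    h         : horizon forecast in each window (assumed value)
    n_windows : number of validation windows (assumed value)
    step_size : steps between consecutive cutoffs (assumed value)
    """
    return client.cross_validation(
        df=dask_df,
        h=h,
        n_windows=n_windows,
        step_size=step_size,
    )


# Usage (needs the nixtla package, a valid API key and network access):
#   cv_df = cross_validate_on_dask(nixtla_client, dask_df)
#   cv_df.head()
```

Each row of the result carries a `cutoff` column marking the last timestamp the model saw before producing that window's forecasts.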
You can also use exogenous variables with TimeGPT on top of Dask. To
do this, please refer to the Exogenous Variables tutorial. Just keep in
mind that you need to use a Dask DataFrame instead of a pandas
DataFrame.
