Time series forecast is a very commen problem in many industries, like price forecast in financial investment, weather forecast for renewable energy production, sales forecast for business and so on. To solve this type of problem, the analyst usually goes through following steps: explorary data analysis, data preprocessing, feature engineering, comparing different forecast models, model selection and evaluation.
Case 1: Univariate Time-series Forecasting
The first case demonstrates a typical process of time series forecast of hourly wet bulb temperature in Singapore. Dataset is published by National Environment Agency, recorded at the Changi Climate Station from 1/1/1982 to 11/30/2021.
Link: https://data.gov.sg/dataset/wet-bulb-temperature-hourly
Case 2: Multivariate Time-series Forecasting
The second demonstrates a case study of multivariate time series forecast. It uses Jena Climate dataset recorded by the Max Planck Institute for Biogeochemistry. The dataset consists of 21 features such as temperature, pressure, humidity etc, recorded once per 10 minutes, from 01/01/2021 to 07/01/2021.
Link: https://www.bgc-jena.mpg.de/wetter/
Raw data include following parameters as example, the target is to predict future temperature:
- Temperature in Kelvin Tpot (K)
- Relative Humidityrh (%)
- Wind Speed wv (m/s)
Correlation matrix indicates they are not highly correlated, thus not redundant.
Result shows the temperature forecast with three parameters does perform better than that with only one parameter.
Case 3: Probabilistic RNN Forecasting
Some models are able to do probabilistic forecasting by assigning likelihood function. By default TimeSeries.plot()
shows the median as well as the 5th and 95th percentiles (of the marginal distributions, if the TimeSeries is multivariate). It is possible to control by setting low_quantile and high_quantile. Predicted object includes N_steps to forecast and N_samples to predicted at each step, as a data array of (N_steps, N_samples).
Darts
darts is a Python library for easy manipulation and forecasting of time series. It contains a variety of models, from classics such as ARIMA to deep neural networks. The library makes it easy to compare various models’ performance, backtest models, and combine the predictions of several models and external regressors.
Models available in DARTS: (7 Jan, 2022)
Model | Univariate | Multivariate | Probabilistic | Multiple-series training | Past-observed covariates support | Future-known covariates support | Reference |
---|---|---|---|---|---|---|---|
ARIMA |
✅ | ✅ | ✅ | ||||
VARIMA |
✅ | ✅ | ✅ | ||||
AutoARIMA |
✅ | ✅ | |||||
ExponentialSmoothing |
✅ | ✅ | |||||
Theta and FourTheta |
✅ | Theta & 4 Theta | |||||
Prophet |
✅ | ✅ | ✅ | Prophet repo | |||
FFT (Fast Fourier Transform) |
✅ | ||||||
RegressionModel (incl RandomForest , LinearRegressionModel and LightGBMModel ) |
✅ | ✅ | ✅ | ✅ | ✅ | ||
RNNModel (incl. LSTM and GRU); equivalent to DeepAR in its probabilistic version |
✅ | ✅ | ✅ | ✅ | ✅ | DeepAR paper | |
BlockRNNModel (incl. LSTM and GRU) |
✅ | ✅ | ✅ | ✅ | ✅ | ||
NBEATSModel |
✅ | ✅ | ✅ | ✅ | ✅ | N-BEATS paper | |
TCNModel |
✅ | ✅ | ✅ | ✅ | ✅ | TCN paper, DeepTCN paper, blog post | |
TransformerModel |
✅ | ✅ | ✅ | ✅ | ✅ | ||
TFTModel (Temporal Fusion Transformer) |
✅ | ✅ | ✅ | ✅ | ✅ | ✅ | TFT paper, PyTorch Forecasting |
Naive Baselines | ✅ |