Out-of-Sample Forecasting and Evaluation

Learn to perform out-of-sample forecasts and how to evaluate them.

Motivation

The ultimate goal of many time series models is to be able to predict the future realizations of a process. A good time series model needs to forecast at an acceptable level of accuracy in production, using data that it hasn’t seen before. In other words, the model must perform well out-of-sample.

Out-of-sample forecasting can come in many flavors, though. When developing our solution, we need to ask ourselves questions such as:

  • Do we want to forecast one or multiple steps ahead?

  • How many steps ahead is it useful/necessary to forecast? How many of these can we confidently predict?

  • How often do we want to update our model? Is it worth the cost?

  • What is an acceptable level of performance?

Our use case will usually determine the answers to most of these questions. Demand prediction, for instance, might require constant model updates and near real-time predictions. For more stable processes, such as our temperatures dataset, we might not need to update our model very often. However, we might still need to update our multiple-step-ahead forecasts as soon as we get new data. The question about the acceptable level of performance requires choosing an evaluation metric. This will typically be the mean squared error (MSE) or the root mean squared error (RMSE).
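As a quick illustration of these two metrics, the sketch below computes them with NumPy. The temperature values are made up for the example and do not come from the dataset used in this lesson.

```python
import numpy as np

# Hypothetical actual and forecast temperatures (illustrative values only)
actual = np.array([52.0, 54.0, 55.0, 57.0])
forecast = np.array([51.0, 56.0, 55.0, 56.0])

# Forecast errors: actual minus predicted
errors = actual - forecast

# MSE: the mean of the squared errors
mse = np.mean(errors ** 2)

# RMSE: the square root of the MSE, in the same units as the data
rmse = np.sqrt(mse)

print(f"MSE: {mse:.3f}, RMSE: {rmse:.3f}")
```

Because the RMSE is expressed in the same units as the series (degrees, in our case), it is usually easier to interpret than the MSE.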

Appending new data

The ARIMAResults object in statsmodels has a handy method called append() for updating the model’s data. This method creates a new results object with an updated dataset, including the latest data that comes after the original model’s last observation. For instance, imagine that we have trained a model using temperatures from January 1, 2020, to December 31, 2020. If we were to apply the append() method, we’d need to give it data starting from January 1, 2021.

The code snippet below produces the documentation of the append() method:

from statsmodels.tsa.arima.model import ARIMAResults
help(ARIMAResults.append)

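An important thing to notice about the append() method is that it does not modify the existing object. Instead, it returns a new ARIMAResults object whose dataset is the original training data followed by the new observations. This means that the argument endog should contain only the new observations, and these must continue directly from the end of the original sample. By default (refit=False), the fitted parameters are kept as they are, so no re-estimation takes place. If we want to apply the fitted parameters to a dataset that replaces the original one, the apply() method is the one to use.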

The append() method is quite useful when we want to produce one-step-ahead forecasts in a rolling-window fashion without training a new model every time. Let’s see how:

import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.iolib.smpickle import load_pickle
import numpy as np

# Read and format data
df_monthly = pd.read_csv('SF_AVTEMP_MONTHLY.csv')
df_monthly['Date'] = pd.to_datetime(df_monthly['Date'])
df_monthly.index = df_monthly['Date']

# Get the out-of-sample set (the set that the model was not trained on),
# i.e., the last 12 rows of the monthly data
oo = df_monthly[-12:]

# Read model
model = load_pickle('sarima_100_100_12')

# Set placeholders for forecasts and confidence interval limits
rolling_forecasts = []
rolling_lower_ci = []
rolling_upper_ci = []

# Predict one step at a time in a loop
for i in range(len(oo)):
    # Each pass forecasts the step right after the model's current last observation
    # Get the one-step-ahead forecast and CIs
    frame = model.get_forecast(1).summary_frame()
    forecast = float(frame['mean'].iloc[0])
    lower_ci, upper_ci = frame[['mean_ci_lower', 'mean_ci_upper']].iloc[0].tolist()

    # Append forecast to placeholder lists
    rolling_forecasts.append(forecast)
    rolling_lower_ci.append(lower_ci)
    rolling_upper_ci.append(upper_ci)

    # Extend model with latest observation
    model = model.append([oo['avtemp'].iloc[i]])

# Plot actual series and predictions
fig, ax = plt.subplots(1, 1, figsize=(12, 7))
ax.plot(oo['avtemp'].values, color='purple', marker='o', label='Actual values')
ax.plot(rolling_forecasts, color='orange', marker='o', label='Forecast')
ax.fill_between(range(len(oo)), rolling_lower_ci, rolling_upper_ci, color='grey', alpha=0.2, label='95% confidence interval')
ax.set_ylabel('Temperature (°F)')
ax.set_xlabel('Month')
# Month-start dates for the tick labels ('MS' avoids the deprecated 'M' alias)
date_range = pd.date_range(start='2021-01-01', end='2021-12-31', freq='MS')
ax.set_xticks(range(len(date_range)))
ax.set_xticklabels(date_range.strftime('%b %Y'), rotation=45)
ax.legend()
plt.show()

The code above produces one year’s worth of monthly average temperature forecasts as one-step-ahead forecasts with the append() method. The crucial part happens between lines 18 and 37, right after reading the model and the out-of-sample data:

  • Lines 19–21: We create some placeholder lists where we will store our forecasts and the bounds of their confidence intervals.

  • Line 24: We start a loop for as many steps as months we want to forecast. In this case, it is the length of the out-of-sample data: one year.

  • Lines 27–29: We calculate the one-step-ahead forecast with the get_forecast method of the model object (an ARIMAResults object). Here, we also extract the confidence interval ...