Out-of-Sample Forecasting and Evaluation
Learn to perform out-of-sample forecasts and how to evaluate them.
Motivation
The ultimate goal of many time series models is to be able to predict the future realizations of a process. A good time series model needs to forecast at an acceptable level of accuracy in production, using data that it hasn’t seen before. In other words, the model must perform well out-of-sample.
Out-of-sample forecasting can come in many flavors, though. When developing our solution, we need to ask ourselves questions such as:
Do we want to forecast one or multiple steps ahead?
How many steps ahead is it useful/necessary to forecast? How many of these can we confidently predict?
How often do we want to update our model? Is it worth the cost?
What is an acceptable level of performance?
Our use case will usually determine the answer to most of these questions. Demand prediction, for instance, might require constantly updating our model and producing near real-time predictions. For more stable processes, such as our temperatures dataset, we might not need to update our model very often. However, we might still need to update our multiple-step-ahead forecasts as soon as we get new data. Deciding on an acceptable level of performance requires choosing an evaluation metric, typically the mean squared error (MSE) or the root mean squared error (RMSE).
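As a minimal sketch of the two metrics mentioned above, the snippet below computes the MSE and the RMSE for a handful of forecasts. The helper names mse and rmse and the sample values are hypothetical and only serve as an illustration; they are not taken from the temperatures dataset.

```python
import math


def mse(actual, forecast):
    """Mean squared error: the average of the squared forecast errors."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(actual, forecast))


# Hypothetical observed values and forecasts (illustration only)
actual = [50.0, 52.0, 55.0, 57.0]
forecast = [51.0, 52.0, 53.0, 60.0]

mse_value = mse(actual, forecast)
rmse_value = rmse(actual, forecast)
print(f"MSE: {mse_value:.3f}, RMSE: {rmse_value:.3f}")
```

Because the RMSE is expressed in the same units as the series itself (degrees, in the case of temperatures), it is often the easier of the two to interpret.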
Appending new data
The ARIMAResults object in statsmodels has a handy method called append() for updating the model’s data. This method creates a new results object with an updated dataset, including the latest data that comes after the original model’s last observation. For instance, imagine that we have trained a model using temperatures from January 1, 2020, to December 31, 2020. If we were to apply the append() method, we’d need to give it data starting from January 1, 2021.
The code snippet below produces the documentation of the append() method:
from statsmodels.tsa.arima.model import ARIMAResults
help(ARIMAResults.append)
An important thing to notice about the append() method is that it doesn’t modify the results object it’s called on. Instead, it creates a wholly new ARIMAResults object. The dataset of the new object is the original training data followed by the new observations, so the argument endog should contain only the new data, not both sets. By default (refit=False), the parameters aren’t re-estimated. The existing parameters are simply applied to the longer dataset.
The append() method is quite useful when we want to produce one-step-ahead forecasts in a rolling-window fashion without training a new model every time. Let’s see how:
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.iolib.smpickle import load_pickle
import numpy as np

# Read and format data
df_monthly = pd.read_csv('SF_AVTEMP_MONTHLY.csv')
df_monthly['Date'] = pd.to_datetime(df_monthly['Date'])
df_monthly.index = df_monthly['Date']

# Get the out-of-sample set (the set that the model was not trained on)
oo = df_monthly[-12:]

# Read model
model = load_pickle('sarima_100_100_12')


# Set placeholders for forecasts and confidence interval limits
rolling_forecasts = []
rolling_lower_ci = []
rolling_upper_ci = []

# Predict one step at a time in a loop
for i in range(len(oo)):

    # Get the one-step-ahead forecast and CIs (first and only row of the summary frame)
    forecast = float(model.get_forecast(1).summary_frame()['mean'].iloc[0])
    lower_ci = float(model.get_forecast(1).summary_frame()['mean_ci_lower'].iloc[0])
    upper_ci = float(model.get_forecast(1).summary_frame()['mean_ci_upper'].iloc[0])

    # Append forecast to placeholder lists
    rolling_forecasts.append(forecast)
    rolling_lower_ci.append(lower_ci)
    rolling_upper_ci.append(upper_ci)

    # Extend model with latest observation
    model = model.append([oo['avtemp'].iloc[i]])

# Plot actual series and predictions
fig, ax = plt.subplots(1, 1, figsize=(12, 7))
ax.plot(oo['avtemp'].values, color='purple', marker='o', label='Actual values')
ax.plot(rolling_forecasts, color='orange', marker='o', label='Forecast')
ax.fill_between(range(0, 12), rolling_lower_ci, rolling_upper_ci, color='grey', alpha=0.2, label='95% confidence interval')
ax.set_ylabel('Temperature (°F)')
ax.set_xlabel('Month')
date_range = pd.date_range(start='2021-01-01', end='2021-12-31', freq='M')
ax.set_xticks(range(len(date_range)))
ax.set_xticklabels(date_range.strftime('%b %Y'), rotation=45)
ax.legend()
plt.show()
The code above produces one year’s worth of monthly average temperature forecasts as one-step-ahead forecasts with the append() method. The crucial part happens between lines 18 and 37, right after reading the model and the out-of-sample data:
Lines 19–21: We create some placeholder lists where we will store our forecasts and the lower and upper bounds of their confidence intervals.
Line 24: We start a loop for as many steps as months we want to forecast. In this case, it is the length of the out-of-sample data: one year.
Lines 27–29: We calculate the one-step-ahead forecast with the get_forecast method of the model object (an ARIMAResults object). Here, we also extract the confidence interval ...