Identifying and Fitting Candidate Models

Learn how to identify potential ARIMA model candidates and fit them with statsmodels.

Now that we know how the different ARIMA model specifications work, we can find the one that best describes our San Francisco temperature data. In this lesson, we will propose a simple workflow for fitting some candidate models.

Let’s return now to the temperature data, this time aggregated to monthly averages. This makes our calculations easier and faster without loss of generality.

import pandas as pd
import matplotlib.pyplot as plt
# Read the data and use the parsed dates as the index
df = pd.read_csv('SF_AVTEMP.csv')
df['Date'] = pd.to_datetime(df['Date'])
df = df.set_index('Date')
# Resample to monthly intervals and average the temperature column
# ('M' means month-end; pandas 2.2+ prefers the alias 'ME')
df_monthly = df[['avtemp']].resample('M').mean()
# Plot resulting DataFrame
plt.figure(figsize = (10,5))
plt.plot(df_monthly['avtemp'], color = 'green')
plt.xlabel('Date')
plt.ylabel('Average monthly temperature (°F)')
plt.show()

Autocorrelogram exploration

The first step in fitting an ARIMA model is to determine the orders of its AR and MA components, which we denote as $p$ and $q$, respectively. These are the most important hyperparameters of any ARIMA$(p, d, q)$ model. We can extend this idea to seasonal ARIMA models, SARIMA$(p, d, q)(P, D, Q)_s$. In that case, the orders of the seasonal components, $P$ and $Q$, also matter.

To identify the components of the model that our temperature series might be following, we ...