...
Applying Convolutional Networks to Multivariate Time Series
Discover convolutional networks’ application in predicting rare events within multivariate time series data.
The rare event prediction problem explored in this course is a multivariate time series problem. Let's proceed to model it with convolutional networks.
Convolution on time series
Before modeling, let’s briefly explore the filters and convolution operation in the context of multivariate time series.
A multivariate time series structure is shown in the image below. It is an illustrative example in which the x-, y-, and z-axes show the time, the features, and the features' values, respectively.
The time series in the illustration has three features with rectangular-, upward pulse-, and downward pulse-like movements. The features are placed along the depth, making them the channels. A filter for such a time series is shown in the illustration below.
The convolution operation between the filter and the time series is shown in the illustration below. Because a time series has only one spatial axis, which is time, the convolution sweeps the filter over time. At each stride, a similarity score between the filter and the overlapping section of the time series is emitted (not shown in the figure). The convolution variants, such as padding, stride, and dilation, work similarly along the time axis.
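A minimal sketch of this sweeping operation is given below. It convolves a single hypothetical filter over a toy three-feature time series with stride one and no padding; the arrays and their values are made up purely for illustration.

import numpy as np

# Toy multivariate time series: 6 time steps, 3 features (channels).
series = np.array([
    [0., 1., 1.],
    [0., 1., 0.],
    [1., 0., 0.],
    [1., 0., 1.],
    [1., 1., 1.],
    [0., 0., 1.],
])

# One filter spanning 3 time steps across all 3 channels.
filt = np.array([
    [1., 0., 0.],
    [1., 0., 0.],
    [1., 0., 0.],
])

# Sweep the filter along the time axis (stride 1, no padding).
feature_map = []
for t in range(series.shape[0] - filt.shape[0] + 1):
    window = series[t:t + filt.shape[0], :]     # section of the series under the filter
    feature_map.append(np.sum(window * filt))   # similarity: element-wise product, summed
print(feature_map)                              # one value emitted per stride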
Imports and data preparation
Like always, the modeling starts with importing the required libraries, including the user-defined ones.
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.keras import optimizers
from tensorflow.keras.models import Model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Conv1D
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPool1D
from tensorflow.keras.layers import AveragePooling1D
from tensorflow.keras.layers import MaxPool2D
from tensorflow.keras.layers import ReLU
from tensorflow.keras.layers import Flatten
from tensorflow.python.keras import backend as K
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from collections import Counter
import matplotlib.pyplot as plt
import seaborn as sns

# user-defined libraries
import datapreprocessing as dp
import performancemetrics as pm
import simpleplots as sp

from numpy.random import seed
seed(1)
SEED = 123  # used to help randomly select the data points
DATA_SPLIT_PCT = 0.2

from pylab import rcParams
rcParams['figure.figsize'] = 8, 6
plt.rcParams.update({'font.size': 22})

print("Data split percent: ", DATA_SPLIT_PCT)
print("Random generator seed: ", SEED)
print("Size of figures to be plotted later: ", rcParams['figure.figsize'])
The above code prints the data split percent, the random generator seed, and the size of the figures to be plotted later.
The tensor shape of a multivariate time series in a convolutional network is the same as in an LSTM network: a three-dimensional tensor of (samples, timesteps, features).
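As a quick illustration of that shape, the sketch below builds a tiny Conv1D model; the window length and feature count are assumptions chosen only to make the example run, not values taken from this dataset.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Conv1D

TIMESTEPS = 5    # assumed lookback window length (illustrative)
N_FEATURES = 61  # assumed number of features after encoding (illustrative)

model = Sequential()
model.add(Input(shape=(TIMESTEPS, N_FEATURES)))  # (timesteps, features); samples form the batch axis
model.add(Conv1D(filters=16, kernel_size=4, activation='relu'))
model.summary()  # the filter sweeps the 5 time steps, yielding a (2, 16) feature map per sample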
df = pd.read_csv("processminer-sheet-break-rare-event-dataset.csv")
df.head(n=5)  # visualize the data

# One-hot encoding
hotencoding1 = pd.get_dummies(df['Grade&Bwt'])
hotencoding1 = hotencoding1.add_prefix('grade_')
hotencoding2 = pd.get_dummies(df['EventPress'])
hotencoding2 = hotencoding2.add_prefix('eventpress_')

df = df.drop(['Grade&Bwt', 'EventPress'], axis=1)
df = pd.concat([df, hotencoding1, hotencoding2], axis=1)

# Rename response column name for ease of understanding
df = df.rename(columns={'SheetBreak': 'y'})

# Shift the response column y by 2 rows to do a 4-minute ahead prediction.
df = dp.curve_shift(df, shift_by=-2)

# Sort by time and drop the time column.
df['DateTime'] = pd.to_datetime(df.DateTime)
df = df.sort_values(by='DateTime')
df = df.drop(['DateTime'], axis=1)

# Convert df to numpy arrays
input_X = df.loc[:, df.columns != 'y'].values
input_y = df['y'].values

print(df)
The above code prints the cleaned DataFrame, displaying the prepared dataset columns.
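Because the upcoming modeling relies on the rarity of the positive class, a quick sanity check of the class balance can be helpful here. The snippet below is an illustrative addition rather than part of the lesson's pipeline; it simply counts the labels produced above.

from collections import Counter

# input_y comes from the preparation code above; 1 marks an upcoming sheet break.
print(Counter(input_y))  # expect far more 0s than 1s, confirming the rare event setting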
Baseline
As ...