The pandas package is an essential part of data analysis in Python. It lets us work with and examine data with remarkable simplicity. One of its most useful methods, shift(), is particularly valuable when working with sequential data, especially time series.
What is the shift() method?
The shift() method moves the data values in a DataFrame along a chosen axis by a specified number of positions. The movement can be either forward (downward) or backward (upward), which lets us line up each row with earlier or later observations.
Why is shift() useful?
The shift() method is handy in many data situations, especially when dealing with time-related information. Here are some practical ways in which it can be helpful:
Predicting future values: When we want to predict what might happen next based on past patterns, shift() helps by creating new columns that hold historical values. These columns can then serve as input features for a prediction model.
Comparing now and then: shift() makes it easy to compare data at different times directly. It helps us spot trends, patterns, or changes in our information.
Calculating changing trends: We can combine shift() with other tools, like averages or standard deviations, to measure how our data is changing over time. This gives us insight into the overall trend.
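The three uses above can be sketched in a few lines. This is a minimal illustration, not a full forecasting workflow: the sales figures and the column names Sales_lag1, Change, and Prev2_mean are made up for the example.

```python
import pandas as pd

# Hypothetical daily sales figures, used only for illustration.
sales = pd.Series([100, 150, 80, 200], name="Sales")

features = pd.DataFrame({
    "Sales": sales,
    # Lag feature: the previous row's value, usable as a model input.
    "Sales_lag1": sales.shift(1),
    # Now versus then: change relative to the previous row.
    "Change": sales - sales.shift(1),
    # Trend: mean of the two preceding rows (excludes the current row).
    "Prev2_mean": sales.shift(1).rolling(window=2).mean(),
})
print(features)
```

Shifting before taking the rolling mean keeps the current row's value out of its own average, which matters if the column is later used to predict that row.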
How does shift() work?
The shift() method has one key argument: periods, the number of positions to move the data, which defaults to 1. A positive number shifts the data down, and a negative number shifts it up. The index stays in place, so the vacated positions are filled with missing values.
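As a quick illustration of the two directions, the sketch below shifts a small, made-up Series by 1 and by -1 and prints the results side by side.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40])

down = s.shift(1)   # positive: values move down, the first slot becomes NaN
up = s.shift(-1)    # negative: values move up, the last slot becomes NaN

print(pd.DataFrame({"original": s, "shift(1)": down, "shift(-1)": up}))
```

Because NaN is a floating-point value, the shifted integer columns are displayed as floats.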
Here is the syntax of the shift() method in pandas:
DataFrame.shift(periods=1, freq=None, axis=0, fill_value=None, suffix=None)
periods (int or sequence of int): The number of periods (positions) to shift. If an iterable of integers is passed, the data is shifted once for each value.
Positive values shift data downwards (forward in time if working with time series).
Negative values shift data upwards (backward in time).
freq (str or DateOffset): The offset to use when shifting time series data. When freq is given, the index values are shifted by that offset and the data is not realigned. Here are some common examples:
'D': Daily frequency
'B': Business day frequency
'M': Month-end frequency (spelled 'ME' in pandas 2.2 and later)
axis (int or str {'index', 'columns'}): The axis along which to shift the data.
0 or 'index': Shift values along the rows.
1 or 'columns': Shift values along the columns.
fill_value (object): The value used to fill the new entries created by the shift operation.
If not specified, the new entries are filled with NaN (Not a Number) for numeric data and NaT for datetime-like data.
suffix (str, optional): If a str is given and periods is an iterable, it is added after the column name and before the shift value in each shifted column name.
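To make the freq, axis, and fill_value parameters more concrete, here is a small sketch. The three-day DataFrame and the variable names by_freq, across, and filled are assumptions made for this example.

```python
import pandas as pd

# Hypothetical three-day sample with a daily DatetimeIndex.
idx = pd.date_range("2024-01-01", periods=3, freq="D")
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=idx)

# freq: the index moves forward one day; the values stay as they are.
by_freq = df.shift(periods=1, freq="D")

# axis=1: values move one column to the right, so "b" receives "a".
across = df.shift(periods=1, axis=1)

# fill_value: the vacated first row gets 0 instead of NaN.
filled = df.shift(periods=1, fill_value=0)

print(by_freq, across, filled, sep="\n\n")
```

With freq, no rows are lost and no missing values appear, because only the index labels change. Without it, the labels stay put and the data slides past them.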
Here’s an example of the shift() method:
import pandas as pd

data = {'Sales': [100, 150, 80, 200], 'Profit': [20, 30, 15, 40]}
df = pd.DataFrame(data)

df['Sales_prev'] = df['Sales'].shift(1)

df['Profit_lag2'] = df['Profit'].shift(2)

print(df)
In the above code:
Line 1: We import the pandas library as pd.
Line 3: We create a dictionary named data to store sample sales and profit figures as lists.
Line 4: We convert the data dictionary into a pandas DataFrame named df.
Line 6: We create a new column, Sales_prev, in the DataFrame df. This column contains the values of the Sales column shifted down by one position (1 period). If each row is one day, it shows the previous day’s sales, and its first entry is NaN.
Line 8: We add another new column, Profit_lag2. This column contains the values of the Profit column shifted down by two positions (2 periods). It shows the profit from two days ago, and its first two entries are NaN.
Line 10: We print the entire DataFrame df. The output shows the original Sales and Profit columns along with the new Sales_prev and Profit_lag2 columns holding the shifted values.
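The shifted columns in this example begin with missing values, which many downstream calculations cannot use directly. One possible way to handle them, sketched here with the same sample data, is to count the gaps and then keep only the complete rows. The variable name complete is an assumption for this sketch.

```python
import pandas as pd

data = {'Sales': [100, 150, 80, 200], 'Profit': [20, 30, 15, 40]}
df = pd.DataFrame(data)
df['Sales_prev'] = df['Sales'].shift(1)
df['Profit_lag2'] = df['Profit'].shift(2)

# Count the missing entries that the shifts introduced in each column.
print(df.isna().sum())

# Keep only the rows in which every lagged column has a value.
complete = df.dropna()
print(complete)
```

Dropping rows is only one option. Passing fill_value to shift() is another, depending on whether a placeholder value makes sense for the data.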
By understanding the shift() method and its applications, we can unlock new possibilities for data exploration and analysis in Python, particularly when working with time series data.