Working with Time Series
Explore how to manipulate and work with time series data.
Note that “time series” here does not mean the pandas Series object; it means data that has a date component. We'll often keep that date component in the index of a pandas Series or DataFrame because doing so makes time-based aggregations easy.
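As a rough illustration of why a date index helps, the sketch below builds a small Series with made-up daily values (not data from this lesson) and averages it over 7-day windows with `resample`. The variable names `flow` and `weekly` are assumptions chosen for this example.

```python
import pandas as pd

# Made-up daily readings for illustration only: 14 days with values 0..13
idx = pd.date_range("2021-01-01", periods=14, freq="D")
flow = pd.Series(range(14), index=idx, dtype="float64")

# Because the index holds dates, we can aggregate by time window directly.
# "7D" groups the rows into consecutive 7-day bins starting at the first day.
weekly = flow.resample("7D").mean()
print(weekly)
```

The same call would not work on a Series with a plain integer index, which is why the date usually goes in the index.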
Loading the Data
For this section, we’re going to explore a dataset from the US Geological Survey that records the flow of the Dirty Devil River in Utah. This data is a tab-delimited ASCII file and is described here in detail. The columns are:
agency_cd: Agency collecting data.
site_no: USGS identification number of site.
datetime: Date.
tz_cd: Timezone.
144166_00060: Discharge (cubic feet per second).
144166_00060_cd: Status of discharge. “A” (approved), “P” (provisional), “e” (estimate).
144167_00065: Gauge height (feet).
144167_00065_cd: Status of gauge_height. “A” (approved), “P” (provisional), “e” (estimate).
Here’s the code to load the data. We’ve also included a tweak function that converts the date information to actual dates and renames some columns. Note that the file is not a CSV file, but we can specify tab as a separator. Also, we need to skip a few of the rows:
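The following is a minimal sketch of such a loading step, not necessarily the lesson's exact code. It reads a small made-up sample that mimics the layout described above (comment lines, a header row, a column-format row, then tab-separated data) so that it runs without the real file. The sample values, the placeholder site number, the rows passed to `skiprows`, and the new column names `cfs` and `gage_height` are all assumptions for illustration; the real file will need different row numbers.

```python
import io

import pandas as pd

# Made-up sample mimicking the file layout: two comment lines, a header row,
# a column-format row, then tab-separated data rows. All values are invented.
SAMPLE = "\n".join([
    "# hypothetical comment line",
    "# another hypothetical comment line",
    "\t".join(["agency_cd", "site_no", "datetime", "tz_cd",
               "144166_00060", "144166_00060_cd",
               "144167_00065", "144167_00065_cd"]),
    "\t".join(["5s", "15s", "20d", "6s", "14n", "10s", "14n", "10s"]),
    "\t".join(["USGS", "00000000", "2001-05-07 01:00", "MDT",
               "71.0", "A", "1.5", "A"]),
    "\t".join(["USGS", "00000000", "2001-05-07 01:15", "MDT",
               "72.0", "A", "1.6", "A"]),
])


def tweak_river(df):
    """Convert the date column to real datetimes, rename the measurement
    columns (names assumed here), and move the dates into the index."""
    return (
        df.assign(datetime=pd.to_datetime(df["datetime"]))
        .rename(columns={"144166_00060": "cfs",
                         "144167_00065": "gage_height"})
        .set_index("datetime")
    )


# sep="\t" handles the tab delimiter. skiprows drops the two comment lines
# (rows 0 and 1) and the format row (row 3), so row 2 becomes the header.
raw = pd.read_csv(
    io.StringIO(SAMPLE),
    sep="\t",
    skiprows=[0, 1, 3],
    dtype={"site_no": str},  # keep leading zeros in the site number
)
dd = tweak_river(raw)
print(dd)
```

To load the actual dataset, replace `io.StringIO(SAMPLE)` with the file's path or URL and adjust `skiprows` to match the number of header lines in that file.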