Describing a Series
Learn how to describe a series using SQL.
We'll cover the following...
Overview
When we get a fresh data set, the first thing we usually want to do is get familiar with it. Some call this exploratory data analysis (EDA). SQL provides functions to produce descriptive statistics.
A series is a one-dimensional list of values. For example, we can use the VALUES
to produce a simple series:
SELECT * FROM (VALUES (1), (2), (3)) AS t(n)
This query uses VALUES
to generate a list of numbers: 1, 2, and 3.
Another way of thinking of a series is as a column in a table. For example, if we have a “sales” table, a “sale_date” column can be thought of as a series of timestamps. A “charged_amount” column can be thought of as a series of numbers.
In this lesson, we’ll learn about aggregate functions that can help us get familiar with the data in a series.
Counting rows
The first thing we always want to know is how many rows are there. In SQL, we can answer this question with the aggregate function COUNT
:
WITH series AS (SELECT * FROM (VALUES (1), (2), (3)) AS t(n))SELECTCOUNT(*)FROMseries;
To count the number of rows in a series, we can use the aggregate function COUNT(*)
.
Minimum and maximum values
When working on a series of data, it is useful to know the range of the values. For example, if we have a series of dates, we want to know the first and last dates ...