DataFrame
Let’s learn about grabbing columns, rows, and specific elements from a DataFrame.
A simple way to think about a DataFrame is as a collection of Series that share the same index. More precisely, a DataFrame is a rectangular table of data that contains an ordered collection of columns, each of which can hold a different value type (numeric, string, boolean, and so on). A DataFrame has both a row index and a column index, and it can be thought of as a dictionary of Series in which every Series uses the same row index.
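The "dictionary of Series" view can be sketched directly. The snippet below is a minimal illustration, not part of the lesson's data: the column names (`price`, `in_stock`) and the row labels are made up for this example. It builds a DataFrame from two Series that share an index.

```python
import pandas as pd

# Hypothetical row labels and columns, chosen only for illustration
labels = ['a', 'b', 'c']
price = pd.Series([1.5, 2.0, 3.25], index=labels)        # numeric column
in_stock = pd.Series([True, False, True], index=labels)  # boolean column

# A dict of Series that share an index becomes a DataFrame
demo = pd.DataFrame({'price': price, 'in_stock': in_stock})
print(demo)

# Each column keeps its own value type but uses the same row index
print(demo.dtypes)
print(demo['price'].index.equals(demo['in_stock'].index))
```

Each key of the dictionary becomes a column label, and the shared index becomes the row labels.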
Let’s create a few DataFrames to learn more about them.
For our DataFrame, we’ll create two labels or indexes:
- Our `index` will hold the row labels, `r1` to `r10`.
- Our `columns` will hold the column labels, `c1` to `c10`.
In the code below, we’ll use `split()` to create a list and then use `arange()` and `reshape()` together to create a 2-D array (matrix).
```python
import pandas as pd
import numpy as np

index = 'r1 r2 r3 r4 r5 r6 r7 r8 r9 r10'.split()
columns = 'c1 c2 c3 c4 c5 c6 c7 c8 c9 c10'.split()

# just to see what the index looks like, a list from r1 to r10!
print(index)
# and what columns look like, a list from c1 to c10!
print(columns)

array_2d = np.arange(0, 100).reshape(10, 10)  # creating a 2D array "array_2d"
print(array_2d)
```
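To see what each of these calls does on its own, here is a smaller sketch of the same three steps. The names `labels`, `flat`, and `small` and the 2 x 3 size are assumptions picked for readability; they do not appear elsewhere in the lesson.

```python
import numpy as np

# split() with no argument breaks a string on whitespace and returns a list
labels = 'x y z'.split()
print(labels)

# arange(0, 6) produces the integers 0..5 (the stop value 6 is excluded)
flat = np.arange(0, 6)
print(flat.shape)

# reshape(2, 3) arranges those 6 values as 2 rows by 3 columns,
# filling each row from left to right before moving to the next
small = flat.reshape(2, 3)
print(small.shape)
print(small)

# The requested shape must account for every element (2 * 3 == 6);
# otherwise reshape() raises a ValueError
```

The lesson's `array_2d` follows the same pattern at a larger size: 100 values arranged as 10 rows by 10 columns.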
Now, let’s create our first DataFrame using `index`, `columns`, and `array_2d`.
```python
df = pd.DataFrame(data=array_2d, index=index, columns=columns)
print(df)
```
Our first DataFrame is `df`. We have columns `c1` to `c10` and their corresponding rows `r1` to `r10`. Each column is actually a pandas Series, sharing a common index of row labels.
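The claim that every column is a Series with the same row labels can be checked with a short, self-contained sketch. It rebuilds the same `df` so that it runs on its own; the checks themselves are an illustration rather than a required step of the lesson.

```python
import numpy as np
import pandas as pd

# Rebuild the same df as in the lesson so this snippet runs on its own
index = 'r1 r2 r3 r4 r5 r6 r7 r8 r9 r10'.split()
columns = 'c1 c2 c3 c4 c5 c6 c7 c8 c9 c10'.split()
df = pd.DataFrame(data=np.arange(0, 100).reshape(10, 10), index=index, columns=columns)

print(df.shape)        # (number of rows, number of columns)
print(type(df['c1']))  # a single column is a pandas Series

# Every column carries the same row labels as the DataFrame itself
same_index = all(df[col].index.equals(df.index) for col in df.columns)
print(same_index)
```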
Now, we can play with `df` to learn how to grab the data that we need, which is one of the most important concepts we’ll cover in this course.
Columns
Grabbing columns from a DataFrame
To grab a column from a DataFrame, we simply pass the name of the required column in square brackets.
```python
# Grabbing a single column
print("Column C1 : ", df['c1'])
```
Notice that the above output ...