DataFrame

Let’s learn about grabbing columns, rows, and specific elements from a DataFrame.

A simple way to think about a DataFrame is as a collection of Series that share the same index. More precisely, a DataFrame is a rectangular table of data that contains an ordered collection of columns, each of which can hold a different value type (numeric, string, boolean, and so on). A DataFrame has both a row index and a column index. It can be thought of as a dictionary of Series, all of which share the same row index.
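To make the "dictionary of Series" idea concrete, here is a minimal sketch. The names ages, names, and people, along with the values they hold, are made up for illustration and are not part of the lesson's data.

```python
import pandas as pd

# Hypothetical example data: two Series that share the row labels 'a', 'b', 'c'
ages = pd.Series([25, 32, 47], index=['a', 'b', 'c'])
names = pd.Series(['Ann', 'Bob', 'Cy'], index=['a', 'b', 'c'])

# A dictionary of Series becomes a DataFrame: each key turns into a column label
people = pd.DataFrame({'age': ages, 'name': names})
print(people)

# Each column keeps its own value type
print(people.dtypes)
```

Because both Series carry the same labels, pandas lines them up row by row, and the shared labels become the row index of the resulting DataFrame.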

Let’s create a few DataFrames to learn more about them.

For our DataFrame, we’ll create two lists of labels:

  • index will hold the row labels, r1 to r10.
  • columns will hold the column labels, c1 to c10.

In the code below, we’ll use split() to create a list and then use arange() and reshape() together to create a 2-D array (matrix).

import pandas as pd
import numpy as np
index = 'r1 r2 r3 r4 r5 r6 r7 r8 r9 r10'.split()
columns = 'c1 c2 c3 c4 c5 c6 c7 c8 c9 c10'.split()
# just to see what the index looks like, a list from r1 to r10!
print(index)
# and what columns look like, a list from c1 to c10!
print(columns)
array_2d = np.arange(0,100).reshape(10,10) # creating a 2D array "array_2d"
print(array_2d)

Now, let’s create our first DataFrame using index, columns, and array_2d.

df = pd.DataFrame(data = array_2d, index = index, columns = columns)
print(df)

Our first DataFrame is df. It has columns c1 to c10 and rows r1 to r10. Each column is actually a pandas Series, and all of the columns share a common index of row labels.
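The claim that each column is a Series sharing the row labels can be checked directly. The sketch below rebuilds df so that it runs on its own, and it uses the bracket selection that the Columns section introduces; the variable name col is only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the same DataFrame as above so this snippet is self-contained
index = 'r1 r2 r3 r4 r5 r6 r7 r8 r9 r10'.split()
columns = 'c1 c2 c3 c4 c5 c6 c7 c8 c9 c10'.split()
df = pd.DataFrame(data=np.arange(0, 100).reshape(10, 10), index=index, columns=columns)

col = df['c1']

# A single column is a pandas Series
print(isinstance(col, pd.Series))

# The column's index is the DataFrame's row index
print(col.index.equals(df.index))

# Two different columns share the same row labels
print(df['c1'].index.equals(df['c10'].index))
```

All three checks should print True for this DataFrame.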

Now, we can play with df to learn how to grab the data that we need, which is one of the most important concepts we’ll cover in this course.

Columns

Grabbing columns from a DataFrame

To grab a column from a DataFrame, we simply pass the name of the required column in square brackets.

# Grabbing a single column
print("Column C1 : ", df['c1'])

Notice that the above output ...