Fixing the Columns

Learn the steps to fix the columns of a dataset.

Understanding the dataset's columns

As a first step when cleaning data, we retrieve the columns and apply standard data wrangling techniques. The goal is to ensure column names are easy to read and reference later during analysis.

employees.csv
NAME,CITY,COUNTRY, HEIGHT ,WEIGHT, ACCOUNT A,ACCOUNT B,TOTAL ACCOUNT
Kevin Hart,MELBOURNE,AUSTRALIA,57,134,2392,4342,6734
Judith Elliot,MANCHESTER,UNITED KINGDOM,61,167,4502,34334,38836
Lydia Carrasco,Oslo,Norway,56,119,,5505,8950
Jane Mattew,AMSTERDAM,NEDERLANDS,59,123,4346,9000,400
Von Gard,Berlin,GERMANYY,,127,7002,19002,26004
Juio Hernade,Mexico City,MEXICO,67,168,5000,4000,3452
Lydia Carrasco,Oslo,Norway,,119,3445,5505,8950
Judith Elliot,MANCHESTER,UNITED KINGDOM,61,,4500,2300,6800
Juio Hernade,Mexico City,MEXICO,67,168,5000,4000,3452
Judith Elliot,MANCHESTER,UNITED KINGDOM,61,167,4502,34334,38836
Lydia Carrasco,Oslo,Norway,56,119,,5505,8950

Let’s review the code line by line:

  • Line 1: We import the pandas library.

  • Line 2: We load the employees.csv dataset into a DataFrame.

  • Line 3: We retrieve the column names from the DataFrame using its columns attribute and print them using the print() function.
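
The three steps above can be sketched as follows. This is a minimal reconstruction under stated assumptions, not necessarily the lesson's exact main.py, so its line numbers differ from the ones in the list. To keep it self-contained, it reads the header and first two rows of employees.csv from an in-memory string (the CSV_SAMPLE name is illustrative) rather than from disk.

```python
import io

import pandas as pd

# Illustrative stand-in for employees.csv: the header plus the first two rows.
CSV_SAMPLE = (
    "NAME,CITY,COUNTRY, HEIGHT ,WEIGHT, ACCOUNT A,ACCOUNT B,TOTAL ACCOUNT\n"
    "Kevin Hart,MELBOURNE,AUSTRALIA,57,134,2392,4342,6734\n"
    "Judith Elliot,MANCHESTER,UNITED KINGDOM,61,167,4502,34334,38836\n"
)

# With the file on disk, this would be: df = pd.read_csv("employees.csv")
df = pd.read_csv(io.StringIO(CSV_SAMPLE))

# df.columns is a pandas Index holding the column labels exactly as read,
# including any spaces around them.
print(df.columns)
```

By default, read_csv keeps whitespace around header fields, which is why the stray spaces survive into the column names.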

As we can see, the output is an Index object listing the DataFrame's column names. We can also see that some of the names, such as HEIGHT and ACCOUNT A, include leading or trailing spaces. We'll remove these spaces in the next section. ...
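
Stray spaces are easy to miss in printed output. One way to confirm which names are affected is to compare each name with its stripped form. The sketch below assumes the header matches the one in employees.csv above; the header list is typed in by hand purely for illustration.

```python
# Column names as they appear in the employees.csv header (typed in by hand).
header = [
    "NAME", "CITY", "COUNTRY", " HEIGHT ", "WEIGHT",
    " ACCOUNT A", "ACCOUNT B", "TOTAL ACCOUNT",
]

# A name has surrounding whitespace if stripping it changes the string.
padded = [name for name in header if name != name.strip()]

# repr() wraps each name in quotes, which makes the spaces visible.
for name in padded:
    print(repr(name))
```

With a DataFrame, iterating over df.columns instead of header performs the same check.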