Getting Familiar with Data and Performing Data Cleaning
Learn about the data and the steps involved in data cleaning.
Getting familiar with data
In your work as a data scientist, there are several possible scenarios in which you may receive such a dataset. These include the following:
1. You created the SQL query that generated the data.
2. A colleague wrote a SQL query for you, with your input.
3. A colleague who knows about the data gave it to you, but without your input.
4. You are given a dataset about which little is known.
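For the last scenario, where little is known about the dataset, a first look might be sketched as follows. This is only a minimal illustration and assumes the data has been loaded into a pandas DataFrame. The helper `first_look`, the toy data, and the column names `account_id` and `balance` are hypothetical and do not come from any particular dataset.

```python
import pandas as pd


def first_look(df: pd.DataFrame) -> pd.DataFrame:
    """Summarize each column: data type, missing values, and distinct values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isna().sum(),
        "n_unique": df.nunique(dropna=True),
    })


# Hypothetical toy data standing in for a dataset received without documentation.
toy = pd.DataFrame({
    "account_id": ["a1", "a2", "a2", "a4"],
    "balance": [100.0, None, 250.5, 80.0],
})

summary = first_look(toy)

# Count rows whose ID repeats an earlier row; IDs are often expected to be unique.
n_duplicate_ids = int(toy["account_id"].duplicated().sum())

print(toy.shape)        # number of rows and columns
print(summary)          # per-column overview
print(n_duplicate_ids)  # repeated IDs worth asking the data owner about
```

A summary like this does not replace talking to whoever produced the data, but it can surface questions to ask, such as why an ID repeats or why a column has missing values.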
In cases 1 and 2, your input was involved in generating/extracting the data. In these scenarios, you probably understood the business problem and then either found the data you needed with the help of a data engineer or did your own research and designed the SQL query that generated the data. Often, especially as you gain more experience in your data science role, the first step will be to meet with the business partner to understand and refine the mathematical definition of the business problem. Then, you would play a key role in defining what is in the ...