Python Data Analysis and Visualization

Get started with the basics of analytics in Python, the language of choice for data science, using Python data structures, descriptive statistics, and much more.

## Introduction to CSV files

Comma-separated values (CSV) files are common in machine learning. Each line of the file holds one **row** of data, written as a _comma-separated list_ in which each element belongs to a **column**. Pandas makes it easy to read this data.
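To make the row and column structure concrete, here is a minimal sketch in plain Python, without Pandas. The sample data (names, ages, cities) is made up for illustration, and the naive `split(",")` only works because this data contains no quoted fields or embedded commas.

```python
# Hypothetical sample data, made up for illustration.
csv_text = """name,age,city
Alice,30,Paris
Bob,25,Berlin
"""

# Each line of the file is one row.
lines = csv_text.strip().split("\n")

# Splitting a line on commas gives the column values of that row.
rows = [line.split(",") for line in lines]

header = rows[0]    # the first row holds the column names
records = rows[1:]  # the remaining rows hold the data

print(header)
print(records)
```

Note that every value comes back as a string here; converting types is one of the things Pandas does for you.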

## Reading a CSV file with Pandas

The documentation for `read_csv` can be found [here](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html). Before reading a CSV file, it helps to know three parameters:

* **`sep`** - the delimiter between fields. This defaults to a comma, but we can specify anything we want. For example, a comma delimiter is a poor choice if some of your columns contain commas. A better option might be `|`.
* **`header`** - which row (if any) has the column names.
* **`names`** - the column names to use.
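The three parameters above can be sketched together as follows. The data is hypothetical: a pipe-delimited file with no header row, read from an in-memory string through `io.StringIO` so the example is self-contained. The column names passed to `names` are assumptions chosen for this illustration.

```python
import io

import pandas as pd

# Hypothetical pipe-delimited data with no header row.
# The last field contains commas, which is why "|" is used as the delimiter.
raw = "Alice|30|Paris, France\nBob|25|Berlin, Germany\n"

df = pd.read_csv(
    io.StringIO(raw),
    sep="|",                        # fields are separated by "|", not ","
    header=None,                    # no row in the data holds column names
    names=["name", "age", "city"],  # so we supply the column names ourselves
)

print(df)
```

Because the delimiter is `|`, the commas inside the `city` values are kept as ordinary text rather than treated as field separators.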

If your CSV is well formatted, with the column names in the first row, then the default parameters should work well.

Reading a CSV file without Pandas might sound simple, but CSV files are often messy, and reading them correctly can mean handling many edge cases. Pandas handles many of those edge cases right out of the box and has many parameters that you can change to handle messier CSV files.
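As a rough illustration of such edge cases, the sketch below uses a made-up messy file: a comment line at the top, spaces after the delimiters, a quoted field that contains a comma, and a missing value written as `N/A`. The parameters shown (`comment`, `skipinitialspace`, `na_values`) are only a few of the options `read_csv` offers; which ones you need depends on your data.

```python
import io

import pandas as pd

# Hypothetical messy data, made up for illustration.
messy = '''# hypothetical export note
name, age, city
"Smith, Alice", 30, Paris
Bob, N/A, Berlin
'''

# A naive split breaks the quoted name into two pieces.
naive_fields = '"Smith, Alice", 30, Paris'.split(",")
print(len(naive_fields))  # 4 pieces instead of the expected 3

df = pd.read_csv(
    io.StringIO(messy),
    comment="#",            # ignore the comment line at the top
    skipinitialspace=True,  # drop the spaces that follow each delimiter
    na_values=["N/A"],      # treat "N/A" as a missing value
)

print(df)
```

Pandas keeps `Smith, Alice` as a single value because the field is quoted, and the missing age is stored as `NaN`.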

Let's see an example with code.
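The following is a minimal sketch of reading a well-formatted CSV file from disk with the default parameters. The file name `people.csv` and its contents are hypothetical; the file is written to a temporary directory first so that the example runs on its own.

```python
import os
import tempfile

import pandas as pd

# Hypothetical, well-formatted CSV content: the first row holds the column names.
csv_text = "name,age,city\nAlice,30,Paris\nBob,25,Berlin\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write the sample file so there is something to read.
    path = os.path.join(tmp_dir, "people.csv")
    with open(path, "w", newline="") as f:
        f.write(csv_text)

    # Default parameters: comma delimiter, first row used as the header.
    df = pd.read_csv(path)

print(df.head())  # the first rows of the DataFrame
print(df.dtypes)  # the column types Pandas inferred
```

With your own data, you would skip the writing step and pass the path of your file to `pd.read_csv` directly.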

This lesson focuses on CSV files. It explains how to read data from CSV files using Python's Pandas library.
