The Dataset and Exploratory Data Analysis

Learn how to read the dataset and perform exploratory data analysis.

We'll cover the following...

Dataset
- Data dictionary
Exploratory data analysis
- Visualize the missing data
- Know more about the data

Let's explore the Titanic dataset, one of the most famous benchmark datasets in machine learning. It's often treated as a first step toward learning classification.

Dataset

The Titanic dataset has the features listed in the data dictionary below. We want to predict whether a passenger survived, so the target is the Survived column.
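
To make the idea of a target column concrete, here is a minimal sketch, assuming the data is already in a pandas DataFrame. The three rows are made up for illustration and aren't real passenger records. The names `X` and `y` are only a common convention.

```python
import pandas as pd

# Tiny made-up frame (NOT real passenger records) with a few of the columns.
df = pd.DataFrame(
    {
        "Pclass": [3, 1, 2],
        "Sex": ["male", "female", "female"],
        "Fare": [8.0, 70.0, 20.0],
        "Survived": [0, 1, 1],
    }
)

# Features: every column except the target.
X = df.drop(columns=["Survived"])

# Target: the Survived column (0 = No, 1 = Yes).
y = df["Survived"]

print(X.columns.tolist())
print(y.tolist())
```

Keeping the target out of `X` matters because a model that sees Survived among its inputs would simply copy the answer.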

Data dictionary

PassengerId: Passenger ID
Pclass: Ticket class, where 1 = 1st, 2 = 2nd, and 3 = 3rd
Name: Passenger name
Sex: Male/female
Age: Age in years
SibSp: Number of siblings and/or spouses aboard the Titanic
Parch: Number of parents and/or children aboard the Titanic
Ticket: Ticket number
Fare: Passenger fare
Cabin: Cabin number
Embarked: Port of embarkation, where C = Cherbourg, Q = Queenstown, and S = Southampton
Survived: 0 = No, and 1 = Yes
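
As a minimal sketch of reading data with these columns and counting the missing values, we could use pandas as shown below. The inline CSV holds three made-up rows purely for illustration (they aren't real passenger records). With the real data, we'd pass the file path to `pd.read_csv` instead. The file name `titanic.csv` in the comment is an assumption.

```python
import io

import pandas as pd

# Made-up sample rows (NOT real passenger records) using the columns
# from the data dictionary. Empty fields are read as missing values (NaN).
SAMPLE_CSV = """PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,Passenger One,male,30,0,0,T100,8.0,,S
2,1,1,Passenger Two,female,,1,0,T200,70.0,C10,C
3,1,2,Passenger Three,female,25,0,1,T300,20.0,,Q
"""

# With the real file, this would be something like
# pd.read_csv("titanic.csv"), where the file name is an assumption.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Number of missing values in each column.
missing_counts = df.isnull().sum()

print(df.shape)
print(missing_counts[missing_counts > 0])
```

Counting missing values per column is a quick first check before any visualization, since it shows which features will need imputation or removal.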

The goal here is to predict if a passenger survived ...
