The Dataset and Exploratory Data Analysis
Learn how to read the dataset and perform exploratory data analysis.
Let's explore the Titanic dataset, one of the most famous benchmark datasets in machine learning. Working with it is often considered a first step toward learning classification.
Dataset
The Titanic dataset has the features listed in the data dictionary below. We want to predict whether a passenger survived, so the target is the Survived column.
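As a minimal sketch of reading the data and separating the target from the features, the snippet below uses pandas. The four rows are made-up placeholders rather than real passenger records, and the file name train.csv mentioned in the comment is an assumption about how the data might be stored.

```python
import io

import pandas as pd

# Made-up placeholder rows standing in for the real file (illustration only).
CSV_TEXT = """PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,Passenger A,male,22,1,0,T-001,7.25,,S
2,1,1,Passenger B,female,38,1,0,T-002,71.28,C-10,C
3,1,3,Passenger C,female,,0,0,T-003,7.92,,S
4,0,2,Passenger D,male,35,0,0,T-004,13.00,,Q
"""

# With a file on disk this would be pd.read_csv("train.csv") (assumed file name).
df = pd.read_csv(io.StringIO(CSV_TEXT))

# The target is the Survived column; every other column is a candidate feature.
X = df.drop(columns=["Survived"])
y = df["Survived"]

print(X.shape, y.shape)
print(df.head())
```

Keeping the target in its own variable from the start makes it harder to leak it into the features by accident later on.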
Data dictionary
PassengerId: Passenger ID
Pclass: Ticket class, where 1 = 1st, 2 = 2nd, and 3 = 3rd
Name: Passenger name
Sex: Male/female
Age: Age in years
SibSp: Number of siblings and/or spouses aboard the Titanic
Parch: Number of parents and/or children aboard the Titanic
Ticket: Ticket number
Fare: Passenger fare
Cabin: Cabin number
Embarked: Port of embarkation, where C = Cherbourg, Q = Queenstown, and S = Southampton
...
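The coded columns in the data dictionary can be made easier to read during exploration. The sketch below maps the Pclass and Embarked codes to their labels, counts missing values per column, and computes the survival rate by sex. The rows are made-up placeholders, and the column names PclassLabel and Port are hypothetical names introduced only for this illustration.

```python
import pandas as pd

# Made-up placeholder rows (illustration only, not real passenger records).
df = pd.DataFrame(
    {
        "Survived": [0, 1, 1, 0, 0, 1],
        "Pclass": [3, 1, 3, 2, 1, 3],
        "Sex": ["male", "female", "female", "male", "female", "male"],
        "Age": [22.0, 38.0, float("nan"), 35.0, 54.0, float("nan")],
        "Embarked": ["S", "C", "S", "Q", "S", None],
    }
)

# Code-to-label mappings taken from the data dictionary.
pclass_labels = {1: "1st", 2: "2nd", 3: "3rd"}
port_labels = {"C": "Cherbourg", "Q": "Queenstown", "S": "Southampton"}

# PclassLabel and Port are hypothetical helper columns for readability.
df["PclassLabel"] = df["Pclass"].map(pclass_labels)
df["Port"] = df["Embarked"].map(port_labels)

# Count missing values in each column.
missing = df.isna().sum()

# Because Survived is 0/1, its mean within a group is that group's survival rate.
survival_by_sex = df.groupby("Sex")["Survived"].mean()

print(missing)
print(survival_by_sex)
```

Because the rows are placeholders, the printed rates say nothing about the real dataset; the point is only the shape of the analysis.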