Pre-Processing
Learn how we can calculate the ticket class modifier, the gender modifier, and pre-processing.
The pre-processing covers the calculation of the modifiers. We start with the ticket class.
Calculating the ticket class modifier
    # get the modifier given the passenger's pclass
    def get_modifier_pclass(pclass):
        # number of survivors with the same pclass
        cnt_surv_pclass = len(survivors[survivors.Pclass.eq(pclass)])

        # backward probability
        p_cl_surv = cnt_surv_pclass/cnt_survivors

        # probability of the evidence
        p_cl = len(train[train.Pclass.eq(pclass)])/cnt_all

        return p_cl_surv/p_cl
We define a function that takes the passenger’s pclass as input. The Pclass column in our dataset is the ticket class (1 = 1st, 2 = 2nd, 3 = 3rd).
We calculate the backward probability by dividing the number of survivors with the given ticket class (cnt_surv_pclass) in line 4 by the number of all survivors (cnt_survivors) in line 7. Then, we calculate the probability of a passenger holding the given ticket class: in line 10, the number of passengers with that ticket class is divided by the total number of passengers (cnt_all).
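To make the two probabilities concrete, here is a minimal sketch of the same calculation on a tiny, made-up passenger list. The data, the names toy_train and toy_survivors, and the function get_modifier_pclass_toy are invented for illustration. The sketch uses plain Python lists instead of the lesson's pandas DataFrames so that it runs without the dataset, and the numbers are not the Titanic figures.

```python
# Toy data, invented for illustration: one (Pclass, Survived) pair per passenger.
# These are NOT the real Titanic numbers.
toy_train = [
    (1, 1), (1, 1), (1, 0),
    (2, 1), (2, 0),
    (3, 1), (3, 0), (3, 0), (3, 0), (3, 0),
]

# Assumed stand-ins for the lesson's survivors, cnt_all, and cnt_survivors
toy_survivors = [row for row in toy_train if row[1] == 1]
cnt_all = len(toy_train)            # 10 passengers
cnt_survivors = len(toy_survivors)  # 4 survivors


def get_modifier_pclass_toy(pclass):
    # number of survivors with the given ticket class
    cnt_surv_pclass = sum(1 for cls, _ in toy_survivors if cls == pclass)

    # backward probability: P(Pclass | Survived)
    p_cl_surv = cnt_surv_pclass / cnt_survivors

    # probability of the evidence: P(Pclass)
    p_cl = sum(1 for cls, _ in toy_train if cls == pclass) / cnt_all

    return p_cl_surv / p_cl


for pclass in (1, 2, 3):
    print(pclass, round(get_modifier_pclass_toy(pclass), 3))
```

With this toy data, first class gives (2/4)/(3/10) ≈ 1.667, second class gives (1/4)/(2/10) = 1.25, and third class gives (1/4)/(5/10) = 0.5. A result above 1 means the ticket class is more common among survivors than among all passengers, and a result below 1 means it is less common.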
The modifier is the evidence’s backward probability divided by the probability of seeing the evidence. For the given ticket class, the modifier is ...