...
The Convenience Function and Probabilities for Relatives
Learn the concept of convenience functions and how to calculate the probability that passengers with relatives aboard survived the Titanic disaster.
How familial relationships affect survival
Let’s get back to the Titanic. There are still plenty of ways to improve our QBN. A promising feature to include is the relationships between passengers. So far, we’ve ignored any family relationship. Our dataset contains information about the number of siblings and spouses (SibSp) and the number of parents and children (Parch) traveling with a passenger.
The SibSp and Parch fields are numeric values denoting the number of related passengers aboard the Titanic.
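To get a feel for these two fields, we can count how many passengers share each value. The sketch below assumes pandas and uses a small made-up DataFrame named toy (a hypothetical stand-in, not the real training data), so the counts it prints say nothing about the actual Titanic dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for the Titanic training set;
# the values are made up for illustration only.
toy = pd.DataFrame({
    "SibSp": [0, 1, 0, 3, 1, 0],
    "Parch": [0, 0, 2, 1, 0, 0],
})

# Count how many passengers share each SibSp / Parch value,
# ordered by the number of relatives
sibsp_counts = toy["SibSp"].value_counts().sort_index()
parch_counts = toy["Parch"].value_counts().sort_index()

print(sibsp_counts)
print(parch_counts)
```

Running the same two value_counts calls on the real training data would show how the passengers are spread across the different numbers of relatives.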
The following function lets us evaluate how a certain number of related passengers affects the chance of survival.
Convenience function to evaluate the effect of a relationship
def evaluate_relation(relation, value):
    # separate the population
    population = train[train[relation].eq(value)] if value < 2 else train[train[relation].ge(value)]
    p = len(population)/len(train)

    # chance to survive
    surv = population[population.Survived.eq(1)]
    p_surv = len(surv)/len(population)
    return (p, p_surv)
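As a quick illustration of what the function returns, here is a minimal sketch that calls it on a tiny made-up dataset. The train DataFrame below is a hypothetical stand-in with invented values, not the real training data. The function itself is repeated unchanged so the block runs on its own.

```python
import pandas as pd

# Hypothetical stand-in for the training data; the values are made up.
train = pd.DataFrame({
    "SibSp":    [0, 0, 0, 0, 1, 1, 2, 3],
    "Survived": [0, 1, 0, 0, 1, 1, 0, 1],
})

def evaluate_relation(relation, value):
    # separate the population
    population = train[train[relation].eq(value)] if value < 2 else train[train[relation].ge(value)]
    p = len(population)/len(train)

    # chance to survive
    surv = population[population.Survived.eq(1)]
    p_surv = len(surv)/len(population)
    return (p, p_surv)

# Evaluate the three groups: no sibling/spouse, exactly one, two or more
for value in (0, 1, 2):
    p, p_surv = evaluate_relation("SibSp", value)
    print(f"SibSp {value}: p={p:.2f}, p_surv={p_surv:.2f}")
```

Note that the call with a value of 2 covers the toy passengers with both 2 and 3 siblings or spouses, because every value of 2 or more falls into the same group.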
The function evaluate_relation takes two parameters: the name of the relation and the value. We start by separating the population from our training dataset in line 3. If the provided value is smaller than 2, we select all passengers with this exact value for the given relation. Otherwise, we group together all passengers with a value that is greater than or equal to 2. The marginal probability of having a certain number of related passengers is given by the size of the selected ...