Estimating a Variable
Learn how to estimate a variable.
Let’s get back to our quantum Bayesian network, which consists of four nodes. The Age and Sex of a passenger determine the Norm. The Norm and the Pclass determine Survival.
Our data consists of all the cases of passengers on board the Titanic. The dataset contains observations of Age, Sex, and Survival. These are observable variables. The values of the Norm are missing data. The Norm is a hidden variable.
The image above depicts the missing CPT of our Bayesian network.
We aim to find the CPTs that maximize the probability of the observed data.
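One way to write this goal down is as a maximum-likelihood objective. The notation here is our own assumption, not the lesson's: x_i stands for the observed values of passenger i, z ranges over the possible values of the hidden Norm, and θ collects the CPT entries we are looking for.

```latex
\hat{\theta} = \arg\max_{\theta} \sum_{i=1}^{N} \log \sum_{z} P(x_i, z \mid \theta)
```

Because the Norm is never observed, each passenger's likelihood sums over its possible values.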
Rather than writing a single big function, we split our code into small pieces we can put together at the end. Let’s start with the marginal probabilities of being a child (isChild) and a passenger’s gender (Sex).
Applying the known
import pandas as pd
train = pd.read_csv('train.csv')

# the maximum age of a passenger we consider as a child
max_child_age = 8

# probability of being a child
population_child = train[train.Age.le(max_child_age)]
p_child = len(population_child)/len(train)

# probability of being female
population_female = train[train.Sex.eq("female")]
p_female = len(population_female)/len(train)

# positions of the qubits
QPOS_ISCHILD = 0
QPOS_SEX = 1

def apply_ischild_sex(qc):
    # set the marginal probability of isChild
    qc.ry(prob_to_angle(p_child), QPOS_ISCHILD)

    # set the marginal probability of Sex
    qc.ry(prob_to_angle(p_female), QPOS_SEX)
In line 5, we set the maximum age, 8 years, up to which we consider a passenger a child. Line 8 selects the children, and line 9 divides their number by the total number of passengers to get the probability of being a child.
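The function apply_ischild_sex relies on a helper, prob_to_angle, that this lesson does not show. As a minimal sketch, assuming the helper follows the usual convention for the RY gate (a rotation by θ applied to |0⟩ yields |1⟩ with probability sin²(θ/2)), it could look like the following. The value 0.35 is purely illustrative and is not taken from the dataset.

```python
from math import asin, sin, sqrt

def prob_to_angle(prob):
    """Assumed helper: RY angle whose |1> measurement probability is prob."""
    return 2 * asin(sqrt(prob))

# Illustrative value only, not computed from train.csv
p_example = 0.35
theta = prob_to_angle(p_example)

# Applying RY(theta) to |0> gives |1> with probability sin^2(theta / 2),
# so converting the angle back should recover the probability we started with
recovered = sin(theta / 2) ** 2
print(round(recovered, 6))
```

With a helper like this, passing p_child or p_female to qc.ry sets the qubit so that measuring 1 occurs with exactly that probability.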
We’ll do the same calculation for the passenger being ...