A Bayesian network represents the probabilistic relationships among a set of random variables and their conditional dependencies, and it provides a compact representation of a joint probability distribution (Murphy, 1998).
A Bayesian network consists of two essential parts: a directed acyclic graph and a set of conditional probability distributions.
Bayesian networks build on the same intuition as the Naïve Bayes classifier but, unlike Naïve Bayes, they are not limited to representing independent features. Bayesian networks let us encode as many dependencies as the situation calls for.
For clarification, consider the following example.
Assume we try to turn on our computer, but it does not start (the observation, or evidence). We want to understand which of the viable causes of this failure is more likely. This simplified illustration assumes only two possible causes: electricity failure and computer malfunction.
The two causes in this example are considered independent because there is no edge between them, but this assumption is not required in general. As long as the graph contains no cycle, a Bayesian network can capture as many causal relations as are needed to credibly describe the real-life situation.
The aim is to estimate the posterior conditional probability distribution of each viable unobserved cause given the observed evidence, i.e., P(Cause | Evidence).
The entire idea of Bayesian networks rests on Bayes’ theorem, which lets us express the conditional probability of a cause given the observed evidence in terms of the converse conditional probability of observing the evidence given the cause:

P(Cause | Evidence) = P(Evidence | Cause) · P(Cause) / P(Evidence)
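To make this concrete, here is a minimal sketch of the computer-failure example in plain Python. The priors and the conditional probability of the computer failing to start are illustrative assumptions, not values from the original example; the enumeration simply applies Bayes’ theorem over the two binary causes.

from itertools import product

# Assumed (illustrative) priors for the two independent causes
p_electricity_failure = 0.1   # P(E = 1)
p_computer_malfunction = 0.2  # P(M = 1)

# Assumed conditional probability table: P(computer does not start | E, M)
p_fails = {(0, 0): 0.01, (0, 1): 0.9, (1, 0): 1.0, (1, 1): 1.0}

# Enumerate the joint probability P(E, M, computer does not start)
joint = {}
for e, m in product([0, 1], repeat=2):
    p_e = p_electricity_failure if e else 1 - p_electricity_failure
    p_m = p_computer_malfunction if m else 1 - p_computer_malfunction
    joint[(e, m)] = p_e * p_m * p_fails[(e, m)]

# P(Evidence): total probability that the computer does not start
evidence = sum(joint.values())

# Posterior P(Cause | Evidence) via Bayes' theorem
p_e_given_fail = sum(p for (e, m), p in joint.items() if e) / evidence
p_m_given_fail = sum(p for (e, m), p in joint.items() if m) / evidence
print(f"P(electricity failure | no start)  = {p_e_given_fail:.3f}")
print(f"P(computer malfunction | no start) = {p_m_given_fail:.3f}")

With these assumed numbers, the malfunction hypothesis comes out as the more likely cause; changing the priors changes the verdict, which is exactly the point of computing the posterior.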
Given its parents, any node in a Bayesian network is conditionally independent of all its non-descendants. Therefore, the joint probability distribution of all random variables in the graph factorizes into a product of the conditional probability distributions of the random variables given their parents.
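Written out, this standard factorization is

P(X1, …, Xn) = P(X1 | Parents(X1)) × … × P(Xn | Parents(Xn))

For the three-node Monty Hall network built below, it reduces to P(Guest, Prize, Monty) = P(Guest) · P(Prize) · P(Monty | Guest, Prize), since Guest and Prize have no parents.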
Here’s the practical implementation of Bayesian networks.
First, let’s look at how to initialize a Bayesian network by quickly implementing the Monty Hall problem. The Monty Hall problem arose from the game show Let’s Make a Deal, where a guest had to pick which one of three doors had a reward behind it. The twist was that, after the guest chose, the host (originally Monty Hall) would open one of the remaining doors that hid no reward and ask whether the guest wanted to switch their choice.
To create the Bayesian network in pomegranate, we first design the distributions that live in each node of the graph. For a discrete Bayesian network, we use DiscreteDistribution objects for the root nodes and ConditionalProbabilityTable objects for the inner and leaf nodes. The first columns in a ConditionalProbabilityTable correspond to the order in which the parents (the second argument) are specified, the next column is the value the node itself takes, and the final column is the probability of that combination. In the case below, the first column corresponds to the value guest takes, the second to the value prize takes, and the third to the value monty takes. A row such as ['B', 'C', 'A', 1.0] therefore gives the probability that Monty reveals door A, given that the guest has chosen door B and the prize is actually behind door C, i.e., P(Monty=A | Guest=B, Prize=C).
from pomegranate import *

# The guest's initial door selection is completely random
guest = DiscreteDistribution({'A': 1./3, 'B': 1./3, 'C': 1./3})

# The door the prize is behind is also completely random
prize = DiscreteDistribution({'A': 1./3, 'B': 1./3, 'C': 1./3})

# Monty is dependent on both the guest and the prize.
monty = ConditionalProbabilityTable(
    [['A', 'A', 'A', 0.0],
     ['A', 'A', 'B', 0.5],
     ['A', 'A', 'C', 0.5],
     ['A', 'B', 'A', 0.0],
     ['A', 'B', 'B', 0.0],
     ['A', 'B', 'C', 1.0],
     ['A', 'C', 'A', 0.0],
     ['A', 'C', 'B', 1.0],
     ['A', 'C', 'C', 0.0],
     ['B', 'A', 'A', 0.0],
     ['B', 'A', 'B', 0.0],
     ['B', 'A', 'C', 1.0],
     ['B', 'B', 'A', 0.5],
     ['B', 'B', 'B', 0.0],
     ['B', 'B', 'C', 0.5],
     ['B', 'C', 'A', 1.0],
     ['B', 'C', 'B', 0.0],
     ['B', 'C', 'C', 0.0],
     ['C', 'A', 'A', 0.0],
     ['C', 'A', 'B', 1.0],
     ['C', 'A', 'C', 0.0],
     ['C', 'B', 'A', 1.0],
     ['C', 'B', 'B', 0.0],
     ['C', 'B', 'C', 0.0],
     ['C', 'C', 'A', 0.5],
     ['C', 'C', 'B', 0.5],
     ['C', 'C', 'C', 0.0]],
    [guest, prize])

# State objects hold both the distribution and a high-level name.
s1 = Node(guest, name="guest")
s2 = Node(prize, name="prize")
s3 = Node(monty, name="monty")

# Create the Bayesian network object with a useful name
model = BayesianNetwork("Monty Hall Problem")

# Add the three states to the network
model.add_states(s1, s2, s3)

# Add edges which represent conditional dependencies, where the second node is
# conditionally dependent on the first node (Monty is dependent on both guest and prize)
model.add_edge(s1, s3)
model.add_edge(s2, s3)

# Finalize the network structure
model.bake()

# Probability of a valid case: the guest picks door A, the prize is behind door B,
# so Monty must open door C
print("P(guest=A, prize=B, monty=C):", model.probability([['A', 'B', 'C']]))

# Probability of an invalid case: Monty would never open the prize door
print("P(guest=A, prize=B, monty=B):", model.probability([['A', 'B', 'B']]))
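Beyond scoring complete assignments, the real payoff of this model is posterior inference: given the guest’s pick and the door Monty opens, which door most likely hides the prize? Here is a sketch of that query; it assumes a pre-1.0 pomegranate release in which predict_proba accepts None for unobserved variables and returns a distribution for each unobserved node (newer releases expose a different API).

# Posterior over every node given guest = 'A' and monty = 'B' (prize unobserved).
# Assumes the pre-1.0 pomegranate API.
beliefs = model.predict_proba([['A', None, 'B']])
guest_belief, prize_belief, monty_belief = beliefs[0]
print(prize_belief)  # assigns probability 2/3 to door 'C' and 1/3 to door 'A'

The posterior reproduces the famous result: switching from door A to door C doubles the guest’s chance of winning.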