Bayes' Theorem with Lego
This post is part of our Guide to Bayesian Statistics and is now in book form in Bayesian Statistics the Fun Way!
What's a good blog on probability without a post on Bayes' Theorem? Bayes' Theorem is one of those mathematical ideas that is simultaneously simple and demanding. Its fundamental aim is to formalize how information about one event can give us understanding of another. Let's start with the formula and some Lego, then see where it takes us.
Introducing Bayes' Theorem
Bayes' Theorem states:
$$P(A|B) = \frac{P(B|A)P(A)}{P(B)}$$
As far as formulae go this one isn't too scary; it doesn't even have a \(\Sigma\)! But what is actually happening here? Let's pull out some Lego bricks and put some concrete questions to our equation.
Lego Brick Probability Space
Here we have a 6 x 10 area of Lego bricks. This represents our probability space. In this space there are blue, red and yellow bricks. The yellow bricks rest on top of both the blue and red bricks. Let's make this situation more mathy by assigning some probabilities.
$$P(\text{blue}) = 40/60 = 2/3$$ Blue blocks take up 40 out of the 60 peg area (we're still counting the parts hidden by the yellow blocks).
$$P(\text{red}) = 20/60 = 1/3$$ Red blocks take up the remaining 20 out of 60 spaces.
It is important to note that $$P(\text{blue}) + P(\text{red}) = 1$$ This means that red and blue bricks alone can describe our entire set of possible events. But what about the yellow bricks?
Looking at the above picture we can see that if we pick a peg at random we'll have:
$$P(\text{yellow}) = 6/60 = 1/10$$
But we can't just add \(P(\text{yellow})\) to \(P(\text{red}) + P(\text{blue})\) or else we'd have something greater than 1! The yellow bricks always come with either a red or blue brick. The probability of getting a yellow brick is conditional on whether you are on a blue or red space. In probability theory we write a conditional probability such as \(P(\text{yellow}|\text{blue})\), which is read "the probability of yellow given blue".
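The numbers so far can be checked with a few lines of Python. This is only a minimal sketch: the variable names (`TOTAL_PEGS`, `blue_pegs`, and so on) are made up for illustration, and the peg counts are the ones read off the Lego board described above. Exact fractions are used so the results match the ones in the text.

```python
from fractions import Fraction

# Peg counts taken from the 6 x 10 Lego board described above.
TOTAL_PEGS = 60
blue_pegs = 40
red_pegs = 20
yellow_pegs = 6  # yellow always sits on top of a blue or red peg

p_blue = Fraction(blue_pegs, TOTAL_PEGS)
p_red = Fraction(red_pegs, TOTAL_PEGS)
p_yellow = Fraction(yellow_pegs, TOTAL_PEGS)

print(p_blue, p_red, p_yellow)

# Red and blue together cover the whole board.
print(p_blue + p_red == 1)

# Naively adding yellow overshoots 1, so yellow cannot be
# treated as a third event separate from red and blue.
print(p_blue + p_red + p_yellow)
```

The last sum comes out above 1, which is the signal that yellow has to be handled as a conditional probability rather than added on as its own slice of the board.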
Working out conditional probabilities visually
Let's go back to our Lego and work out \( P(\text{yellow} | \text{red})\).
Visualizing Bayes' Theorem: Solving "Probability of yellow given red" with Lego.
Given the above picture let's walk through the process for determining \(P(\text{yellow}|\text{red})\):
- Split the red section off from the blue 
- Get the area of the remaining red space (2 x 10) 
- Get the area of the yellow block on the red space (4) 
- Divide the area of the yellow block by the area of the red block 
- \(P(\text{yellow}|\text{red})=4/20=1/5\) 
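The steps above amount to dividing one area by another. A minimal sketch in Python, where the helper name `conditional_from_counts` is hypothetical and chosen only for this illustration:

```python
from fractions import Fraction

def conditional_from_counts(overlap_pegs, given_pegs):
    """Return P(A | B) as (pegs that are both A and B) / (pegs that are B)."""
    return Fraction(overlap_pegs, given_pegs)

red_pegs = 2 * 10        # area of the red section once split off from the blue
yellow_on_red_pegs = 4   # area of the yellow block resting on red

p_yellow_given_red = conditional_from_counts(yellow_on_red_pegs, red_pegs)
print(p_yellow_given_red)
```

Restricting attention to the red section is what "given red" means here: the denominator is the red area, not the whole board.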
Great, we have arrived at the conditional probability of yellow given red! So far so good, but what if we reverse that conditional probability and ask for \(P(\text{red}|\text{yellow})\)? That is, "If we know we are on a yellow space, what is the probability it's red underneath?"
Try to work out the probability that if you are on a yellow brick, there's a red brick underneath.
Looking at the picture above you may have easily figured out \(P(\text{red}|\text{yellow})\) by thinking "This is easy! There are 6 yellow pegs, 4 of them are over red, so the probability of being over a red block if I'm on a yellow one is 4/6". If you did follow this line of thinking, congratulations: you just independently discovered Bayes' Theorem!
Working through the math
Of course mathematical language is extremely concise, and human intuition is able to easily jump steps in its reasoning process; getting from our intuition to Bayes' Theorem will require a bit of work. Let's begin formalizing this intuition by coming up with a way to calculate "there are 6 yellow pegs." Our minds arrive at this conclusion through spatial reasoning, but we need to come up with a mathematical approach. To solve this we just take the probability of being on a yellow peg times the total number of pegs:
$$\text{numberOfYellowPegs} = P(\text{yellow}) \cdot \text{totalPegs} = 1/10 \cdot 60 = 6$$
The next part, "4 of them are red", requires a bit more work. First we have to establish how many red pegs there are. Luckily, this works the same way as calculating the number of yellow pegs.
$$\text{numberOfRedPegs} = P(\text{red}) \cdot \text{totalPegs} = 1/3 \cdot 60 = 20$$
We've also already figured out the fraction of the red pegs that are covered by yellow: it's \(P(\text{yellow}|\text{red})\). To make this a count rather than a probability we just need to multiply it by the number of red pegs:
$$\text{numberOfRedUnderYellow} = P(\text{yellow}|\text{red})\cdot \text{numberOfRedPegs} = 1/5 \cdot 20 = 4$$
Finally we just need to get the ratio of the red pegs covered by yellow to the number of yellow and we get our answer.
$$P(\text{red}|\text{yellow}) = \frac{\text{numberOfRedUnderYellow}}{\text{numberOfYellowPegs}} = 4/6 = 2/3$$
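The counting argument above can be replayed step by step in Python. This is a sketch under the post's own numbers; the snake_case variable names simply mirror the names used in the equations.

```python
from fractions import Fraction

TOTAL_PEGS = 60
p_yellow = Fraction(1, 10)
p_red = Fraction(1, 3)
p_yellow_given_red = Fraction(1, 5)

# "There are 6 yellow pegs."
number_of_yellow_pegs = p_yellow * TOTAL_PEGS

# "4 of them are red": first count the red pegs, then take
# the fraction of those that are covered by yellow.
number_of_red_pegs = p_red * TOTAL_PEGS
number_of_red_under_yellow = p_yellow_given_red * number_of_red_pegs

# Ratio of red-under-yellow pegs to all yellow pegs.
p_red_given_yellow = number_of_red_under_yellow / number_of_yellow_pegs

print(number_of_yellow_pegs, number_of_red_pegs, number_of_red_under_yellow)
print(p_red_given_yellow)
```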
This still doesn't quite look like Bayes' Theorem. To get there we'll have to go back and expand the terms in this equation. Replacing the two counts with the expressions we used to calculate them gives
$$P(\text{red}|\text{yellow}) = \frac{P(\text{yellow}|\text{red}) \cdot \text{numberOfRedPegs}}{P(\text{yellow}) \cdot \text{totalPegs}}$$
Expanding \(\text{numberOfRedPegs}\) in the same way gives
$$P(\text{red}|\text{yellow}) = \frac{P(\text{yellow}|\text{red}) \cdot P(\text{red}) \cdot \text{totalPegs}}{P(\text{yellow}) \cdot \text{totalPegs}}$$
And finally, cancelling \(\text{totalPegs}\) from the numerator and denominator, we get
$$P(\text{red}|\text{yellow}) = \frac{P(\text{yellow}|\text{red}) P(\text{red})}{P(\text{yellow})}$$
From intuition we have arrived back at Bayes' Theorem!
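As a quick check, here is a minimal Python sketch of the formula itself. The function name `bayes` is hypothetical. The value \(P(\text{yellow}|\text{blue}) = 2/40\) does not appear in the text above; it is derived here from the same picture, on the assumption that the 2 yellow pegs not over red sit on the 40 blue pegs.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b):
    """Return P(A | B) = P(B | A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

p_red = Fraction(1, 3)
p_blue = Fraction(2, 3)
p_yellow = Fraction(1, 10)
p_yellow_given_red = Fraction(4, 20)
# Assumption: 6 yellow pegs in total, 4 over red, so 2 over the 40 blue pegs.
p_yellow_given_blue = Fraction(2, 40)

p_red_given_yellow = bayes(p_yellow_given_red, p_red, p_yellow)
p_blue_given_yellow = bayes(p_yellow_given_blue, p_blue, p_yellow)

print(p_red_given_yellow)   # matches the 4/6 found by counting pegs
print(p_blue_given_yellow)
# A yellow peg is over either red or blue, so these two sum to 1.
print(p_red_given_yellow + p_blue_given_yellow == 1)
```

Notice that the total number of pegs never appears in `bayes`: it cancelled out of the derivation, so only the three probabilities are needed.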
Conclusion - What did we learn today?
The big takeaways from this experiment should be:
- Conceptually, Bayes' Theorem follows from intuition. 
- The formalization of Bayes' Theorem is not necessarily as obvious. 
The benefit of all our mathematical work is that now we have extracted reason out of intuition. This both confirms that our original, intuitive beliefs are consistent and provides us with a powerful new tool to deal with problems in probability that are more complicated than Lego bricks.
To learn more about Bayes' Theorem and Bayesian Reasoning, check out these posts:
- Learn about Bayesian Priors with Han Solo 
- Understand Bayes' Factor and Bayesian Reasoning by exploring a classic episode of the Twilight Zone 
- Use Bayes' Theorem to reason about the probability that your friends are really allergic to gluten 
If you enjoyed this post please subscribe to keep up to date and follow @willkurt!