Statistics can be quite bewildering. Consider the following problem:
Suppose that when a person who has a disease takes a diagnostic test for it, the test returns a positive result 99% of the time, that is, with a probability of 0.99. Now a person is picked at random and the test returns a positive result. What is the probability that s/he has the disease?
You might think the probability is obviously 0.99. It isn’t. If you did reach that naive conclusion, don’t worry: a lot of eminent scientists and doctors have been seen making the same mistake (try it with your doctor!).
Actually, there isn’t enough information to answer that question. To get to the answer, we need to know how rare the disease is, and how often the test comes back positive for someone who does not have it. Let’s add in this information and work out the answer.
Let’s say it is also known that 5 people in every 1000 suffer from this disease. Put another way, if you pick a person at random, the probability that s/he has the disease is 0.005. Let’s also say that if a person who does not have the disease takes the test, s/he tests positive 2% of the time, or with a probability of 0.02.
Now let’s get down to business.
If we pick a person at random,
let A be the event that the person has the disease, P(A) = 0.005
let B be the event that the person tests positive for the disease, P(B) = ?
let A’ be the event that the person does not have the disease, P(A’) = 1 – P(A) = 0.995
it is given that if a person has the disease, s/he will test positive with a probability 0.99, P(B | A) = 0.99
it is given that if a person does not have the disease, s/he will test positive with a probability 0.02, P(B | A’) = 0.02
We are looking to find P(A | B).
From Bayes’ rule,
P(A | B) = ( P(A) * P(B | A) ) / P(B)
We know everything on the right-hand side except P(B). Since every person either has the disease or does not, we can calculate P(B) by adding up the two ways of testing positive:
P(B) = P(the person has the disease and tests positive) + P(the person does not have the disease and tests positive)
or P(B) = P(A) * P(B | A) + P(A’) * P(B | A’)
or P(B) = 0.005 * 0.99 + 0.995 * 0.02 = 0.02485
Now let’s get P(A | B).
P(A | B) = ( P(A) * P(B | A) ) / P(B)
P(A | B) = (0.005 * 0.99) / 0.02485 = 0.199
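The same arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation above; the function name `posterior` and its parameter names are made up for illustration and do not come from any library.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return (P(B), P(A | B)) for the diagnostic test example.

    prior               -- P(A), probability of having the disease
    sensitivity         -- P(B | A), positive test given disease
    false_positive_rate -- P(B | A'), positive test given no disease
    """
    # Law of total probability: P(B) = P(A) P(B|A) + P(A') P(B|A')
    p_positive = prior * sensitivity + (1 - prior) * false_positive_rate
    # Bayes' rule: P(A|B) = P(A) P(B|A) / P(B)
    p_disease_given_positive = prior * sensitivity / p_positive
    return p_positive, p_disease_given_positive


p_b, p_a_given_b = posterior(prior=0.005, sensitivity=0.99,
                             false_positive_rate=0.02)
print(f"P(B)     = {p_b:.5f}")          # 0.02485
print(f"P(A | B) = {p_a_given_b:.3f}")  # 0.199
```

Changing `prior` while keeping the other two arguments fixed shows how strongly the answer depends on how rare the disease is.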
That’s just under a 20% chance that the person actually has the disease given s/he tests positive for it. And you thought it was 99%!
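A quick way to sanity-check the result is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical population of 100,000 (a number chosen only so that every count comes out whole) and applies the same rates as above.

```python
# Hypothetical population size, chosen so all counts are whole numbers.
population = 100_000

sick = population * 5 // 1000          # 5 in every 1000 have the disease
healthy = population - sick

true_positives = sick * 99 // 100      # 99% of sick people test positive
false_positives = healthy * 2 // 100   # 2% of healthy people test positive

all_positives = true_positives + false_positives
share_sick = true_positives / all_positives

print(f"sick: {sick}, healthy: {healthy}")
print(f"true positives: {true_positives}, false positives: {false_positives}")
print(f"share of positives who are sick: {share_sick:.3f}")
```

Of the 2,485 people who test positive, only 495 actually have the disease. The healthy group is so much larger that even a 2% false positive rate produces about four times as many positives as the sick group does.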
2 Comments
Imagine scaring your doctor with Bayes’ theorem when he’s scaring you with a medical report …
This is proof of what Cory Doctorow’s been saying. Nice.