More Bayes’ magic

Statistics can be quite bewildering. Consider the following problem:

Suppose that when a person who has a certain disease takes a diagnostic test for it, the test returns a positive result 99% of the time, i.e. with a probability of 0.99. Now some person picked at random takes the test and it returns a positive result. What is the probability that s/he has the disease?

You might think that the probability is obviously 0.99. It isn’t. If you did reach the naive conclusion, don’t worry: a lot of eminent scientists and doctors have been seen making the same mistake (try it with your doctor!).

Actually, there isn’t enough information to give an answer to that question. To be able to get to the answer, we need to know how rare the disease is. Let’s add in this information and work out the answer.

Let’s say it is also known that 5 people in every 1000 suffer from this disease. Put another way, if you pick a person at random, the probability that s/he has the disease is 0.005. Let’s also say that if a person who does not have the disease takes the test, s/he tests positive 2% of the time, or with a probability of 0.02.

Now let’s get down to business.

If we pick a person at random,

let A be the event that the person has the disease, P(A) = 0.005

let B be the event that the person tests positive for the disease, P(B) = ?

let A’ be the event that the person does not have the disease, P(A’) = 1 – P(A) = 0.995

it is given that if a person has the disease, s/he will test positive with a probability 0.99, P(B | A) = 0.99

it is given that if a person does not have the disease, s/he will test positive with a probability 0.02, P(B | A’) = 0.02

We are looking to find P(A | B).

From Bayes’ rule,

P(A | B) = ( P(A) * P(B | A) ) / P(B)

We know everything on the right-hand side except P(B). Since every person either has the disease or does not, we can calculate P(B) by adding up the two ways of testing positive:

P(B) = P(the person has the disease and tests positive) + P(the person does not have the disease and tests positive)

or P(B) = P(A) * P(B | A) + P(A’) * P(B | A’)

or P(B) = 0.005 * 0.99 + 0.995 * 0.02 = 0.02485
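The same arithmetic can be checked in a few lines of Python. This is only a sketch: the variable names are my own labels for P(A), P(B | A) and P(B | A’), not anything from a library.

```python
# Probabilities given in the problem (the names are illustrative labels)
p_disease = 0.005            # P(A)
p_pos_given_disease = 0.99   # P(B | A)
p_pos_given_healthy = 0.02   # P(B | A')

p_healthy = 1 - p_disease    # P(A') = 0.995

# P(B) = P(A) * P(B | A) + P(A') * P(B | A')
p_positive = p_disease * p_pos_given_disease + p_healthy * p_pos_given_healthy

print(round(p_positive, 5))  # 0.02485
```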

Now let’s get P(A | B).

P(A | B) = ( P(A) * P(B | A) ) / P(B)

P(A | B) = (0.005 * 0.99) / 0.02485 = 0.199
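If you would rather let a computer do the division, here is a small sketch of the whole calculation as a Python function. The function name `posterior` and its parameter names are hypothetical, chosen for this example only.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(A | B) given P(A), P(B | A) and P(B | A')."""
    true_positives = prior * sensitivity                  # P(A) * P(B | A)
    false_positives = (1 - prior) * false_positive_rate   # P(A') * P(B | A')
    # The denominator is P(B), the total probability of a positive test
    return true_positives / (true_positives + false_positives)


p = posterior(prior=0.005, sensitivity=0.99, false_positive_rate=0.02)
print(round(p, 3))  # 0.199

# How the answer depends on how rare the disease is
# (the extra prior values are made up for illustration)
for prior in (0.001, 0.005, 0.05, 0.5):
    print(prior, round(posterior(prior, 0.99, 0.02), 3))
```

With the test characteristics held fixed, the loop shows the probability climbing as the disease becomes more common, which is why the question cannot be answered without knowing P(A).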

That’s less than a 20% chance that the person actually has the disease given s/he tests positive for it. And you thought it was 99%!
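One way to see why the number comes out so low is to count people instead of multiplying probabilities. The sketch below assumes a population of 100,000 (an arbitrary round figure, not part of the problem) and applies the same three rates to it.

```python
population = 100_000                      # assumed size, chosen for round numbers

sick = population * 5 // 1000             # 5 in every 1000 have the disease: 500
healthy = population - sick               # 99,500

true_positives = sick * 99 // 100         # 99% of the sick test positive: 495
false_positives = healthy * 2 // 100      # 2% of the healthy test positive: 1990

all_positives = true_positives + false_positives
share_sick = true_positives / all_positives

print(true_positives, false_positives, all_positives)  # 495 1990 2485
print(round(share_sick, 3))                             # 0.199
```

The healthy group is so much larger than the sick group that even a 2% error rate produces about four times as many false positives as true positives.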


2 Comments

  1. Posted May 19, 2008 at 3:32 pm | Permalink

    Imagine scaring your doctor with Bayes’ theorem when he’s scaring you with a medical report …

  2. Krishna
    Posted May 21, 2008 at 11:08 am | Permalink

    This is proof of what Cory Doctorow’s been saying. Nice.
