<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The art of Information Engineering &#187; Probability &amp; Statistics</title>
	<atom:link href="http://www.grok.in/blog/cats/probability-statistics/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.grok.in</link>
	<description>(ignorance killed the cat, curiosity was framed)</description>
	<lastBuildDate>Tue, 11 Aug 2009 06:30:18 +0000</lastBuildDate>
	
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>More Bayes&#8217; magic</title>
		<link>http://www.grok.in/blog/2008/05/04/more-bayes-magic/</link>
		<comments>http://www.grok.in/blog/2008/05/04/more-bayes-magic/#comments</comments>
		<pubDate>Sat, 03 May 2008 18:49:26 +0000</pubDate>
		<dc:creator>Siddhartha Reddy</dc:creator>
				<category><![CDATA[Probability & Statistics]]></category>
		<category><![CDATA[bayes]]></category>

		<guid isPermaLink="false">http://www.grok.in/?p=47</guid>
		<description><![CDATA[Statistics can be quite bewildering. Consider the following problem:
It is given that if a person having a disease takes a diagnostic test for the disease, the test returns a positive result 99% of the time, or with a probability of 0.99. Now, for some person picked at random, if the test returns a positive result, [...]]]></description>
			<content:encoded><![CDATA[<p>Statistics can be quite bewildering. Consider the following problem:</p>
<blockquote><p>It is given that if a person having a disease takes a diagnostic test for the disease, the test returns a positive result 99% of the time, or with a probability of 0.99. Now, for some person picked at random, if the test returns a positive result, what is the probability that s/he has the disease?</p></blockquote>
<p>You might think that the probability is <em>of course</em> 0.99. But <em>of course</em> that isn&#8217;t so. If you did reach the naive conclusion, don&#8217;t worry: a lot of eminent scientists and doctors have been seen making the same mistake (try it with your doctor!).</p>
<p><span id="more-47"></span></p>
<p>Actually, there isn&#8217;t enough information to give an answer to that question. To be able to get to the answer, we need to know how rare the disease is. Let&#8217;s add in this information and work out the answer.</p>
<p>Let&#8217;s say it is also known that 5 people in every 1000 suffer from this disease. Put another way, if you pick a person at random, the probability that s/he has the disease is 0.005. Let&#8217;s also say that if a person who does not have the disease takes the test, s/he tests positive 2% of the time, or with a probability of 0.02.</p>
<p>Now let&#8217;s get down to business.</p>
<blockquote><p>If we pick a person at random,</p>
<p>let A be the event that the person has the disease, P(A) = 0.005</p>
<p>let B be the event that the person tests positive for the disease, P(B) = ?</p>
<p>let A&#8217; be the event that the person does not have the disease, P(A&#8217;) = 1 &#8211; P(A) = 0.995</p>
<p>it is given that if a person has the disease, s/he will test positive with a probability 0.99, P(B | A) = 0.99</p>
<p>it is given that if a person does not have the disease, s/he will test positive with a probability 0.02, P(B | A&#8217;) = 0.02</p>
<p>We are looking to find P(A | B).</p>
<p>From Bayes&#8217; rule,</p>
<p>P(A | B) = ( P(A) * P(B | A) ) / P(B)</p></blockquote>
<p>We know everything on the right hand side except P(B). We can calculate P(B) as follows:</p>
<blockquote><p>P(B) = P(the person has the disease and tests positive) + P(the person does not have the disease and tests positive)</p>
<p>or P(B) = P(A) * P(B | A) + P(A&#8217;) * P(B | A&#8217;)</p>
<p>or P(B) = 0.005 * 0.99 + 0.995 * 0.02 = 0.02485</p></blockquote>
<p>Now let&#8217;s get P(A | B).</p>
<blockquote><p>P(A | B) = ( P(A) * P(B | A) ) / P(B)</p>
<p>P(A | B) = (0.005 * 0.99) / 0.02485 = 0.199</p></blockquote>
<p>That&#8217;s less than a 20% chance that the person actually has the disease given that s/he tests positive for it. And you thought it was 99%!</p>
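<p>The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name <code>posterior</code> and the variable names are made up for illustration, and the numbers are the ones assumed in this post.</p>

```python
# Numbers taken from the worked example in this post.
p_disease = 0.005            # P(A): 5 people in every 1000 have the disease
p_pos_given_disease = 0.99   # P(B | A)
p_pos_given_healthy = 0.02   # P(B | A')


def posterior(prior, p_pos_given_a, p_pos_given_not_a):
    """Return (P(B), P(A | B)) via total probability and Bayes' rule."""
    p_b = prior * p_pos_given_a + (1 - prior) * p_pos_given_not_a
    return p_b, prior * p_pos_given_a / p_b


p_positive, p_disease_given_positive = posterior(
    p_disease, p_pos_given_disease, p_pos_given_healthy
)
print(round(p_positive, 5))                # 0.02485
print(round(p_disease_given_positive, 3))  # 0.199
```

<p>Try changing <code>p_disease</code> to see how strongly the answer depends on how rare the disease is.</p>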
]]></content:encoded>
			<wfw:commentRss>http://www.grok.in/blog/2008/05/04/more-bayes-magic/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Probabilities, huh!</title>
		<link>http://www.grok.in/blog/2008/05/03/probabilities-huh/</link>
		<comments>http://www.grok.in/blog/2008/05/03/probabilities-huh/#comments</comments>
		<pubDate>Sat, 03 May 2008 08:05:54 +0000</pubDate>
		<dc:creator>Siddhartha Reddy</dc:creator>
				<category><![CDATA[Probability & Statistics]]></category>
		<category><![CDATA[bayes]]></category>
		<category><![CDATA[probability]]></category>

		<guid isPermaLink="false">http://www.grok.in/?p=46</guid>
		<description><![CDATA[sanket asked a very interesting question in the comments to my previous post on Monty Hall Problem:
Assume that boys and girls are equally likely to be born. Let us say that a family has two children. Given that one of them is a boy, what is the probability that the other one is a boy [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://1stprinciples.wordpress.com/">sanket</a> asked a very interesting question in the comments to my previous <a title="grok.in - Monty Hall Problem" href="http://www.grok.in/blog/2008/04/21/to-switch-or-not-to-switch-that-is-the-question/">post on Monty Hall Problem</a>:</p>
<blockquote><p>Assume that boys and girls are equally likely to be born. Let us say that a family has two children. Given that one of them is a boy, what is the probability that the other one is a boy too? (Source: One of Scott Aaronson&#8217;s (http://scottaaronson.com/blog/) lecture notes.)</p></blockquote>
<p><strong>Update</strong>: It turns out that after stating this problem this way here, I solved a different problem altogether. Thanks, Nikhil, for pointing it out in the comments. To keep my life simple, I&#8217;ll state a modified problem below &#8212; the one that I <em>did</em> solve.</p>
<blockquote><p>Assume that boys and girls are equally likely to be born. Let us say that a family has two children. Given that one of them is a boy, what is the probability that the other one is a girl?</p></blockquote>
<p>Most people would jump out with 1/2 as the answer. Of course, if the answer was that obvious the question wouldn&#8217;t exist. The answer is 2/3. Here I will describe two different ways of arriving at this, as well as the common mistake that leads people to 1/2.</p>
<p><span id="more-46"></span></p>
<p>I will start by defining the problem in more formal terms. We are doing an experiment of giving birth to two children. Let GG, GB, BG and BB denote the outcomes that the children born are girl-girl, girl-boy, boy-girl and boy-boy respectively.</p>
<blockquote><p>So the sample space S is, S = { GG, GB, BG, BB }</p>
<p>Since each outcome is equally likely, the probabilities of the outcomes, P(GG) = P(GB) = P(BG) = P(BB) = 1/4</p>
<p>We are given that one of the children is a boy.</p>
<p>We are interested in the event E that the other child is a girl, E = { GB, BG }</p></blockquote>
<p>But before seeing the correct solutions, let us see one of the <em><strong>flawed</strong></em> solutions that generally leads people to the conclusion that the probability is 1/2.</p>
<blockquote><p>S = { GG, GB, BG, BB }</p>
<p>P(GG) = P(GB) = P(BG) = P(BB) = 1/4</p>
<p>E = { GB, BG }</p>
<p>P(E) = P(GB) + P(BG) = 1/4 + 1/4 = 1/2</p>
<p>Voila! Of course this is <em><strong>wrong</strong></em>, because it ignores the information that one of the children is a boy; the first of the correct solutions below should make it clear why.</p></blockquote>
<p>Now the <em><strong>correct</strong></em> solutions:</p>
<p>1. Let&#8217;s just count.</p>
<blockquote><p>S = { GG, GB, BG, BB }</p>
<p>Since we are given that one of the children is a boy, our <em>sample space reduces</em> to</p>
<p>S&#8217; = { GB, BG, BB}</p>
<p>See how GG is conspicuous by its absence? That is because the outcome GG cannot occur (wow, a boy cannot be a girl!)</p>
<p>In this reduced sample space, our probabilities become</p>
<p>P(GB) = P(BG) = P(BB) = 1/3</p>
<p>E = { GB, BG }</p>
<p>P(E) = P(GB) + P(BG) = 1/3 + 1/3 = 2/3</p>
<p>Huh!</p></blockquote>
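<p>The counting argument can be sketched in Python by enumerating the outcomes directly. The variable names here are illustrative only, and the sketch assumes, as the problem does, that all four outcomes are equally likely.</p>

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes for two children: GG, GB, BG, BB.
sample_space = ["".join(pair) for pair in product("GB", repeat=2)]

# Condition on "at least one child is a boy": this drops GG.
reduced = [outcome for outcome in sample_space if "B" in outcome]

# Event E: the other child is a girl, i.e. one boy and one girl.
favourable = [outcome for outcome in reduced if "G" in outcome]

p_e = Fraction(len(favourable), len(reduced))
print(reduced)  # ['GB', 'BG', 'BB']
print(p_e)      # 2/3
```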
<p>2. Using Bayes&#8217; Rule.</p>
<blockquote><p>S = { GG, GB, BG, BB }</p>
<p>P(GG) = P(GB) = P(BG) = P(BB) = 1/4</p>
<p>E = { GB, BG }</p>
<p>Let F be the event that at least one of the children is a boy, F = { GB, BG, BB }</p>
<p>We are looking for probability of E given F, P(E | F)</p>
<p>Using Bayes&#8217; rule,</p>
<p>P(E | F) = (P(E) * P(F | E)) / P(F)</p>
<p>Now, P(E) = P(GB) + P(BG) = 1/4 + 1/4 = 1/2</p>
<p>P(F) = P(GB) + P(BG) + P(BB) = 1/4 + 1/4 + 1/4 = 3/4</p>
<p>Because E is the event that there is one boy and one girl and F is the event that there is at least one boy, P(F | E) = 1</p>
<p>So, P(E | F) = (1/2 * 1) / (3/4) = 2/3</p></blockquote>
<p>The observant might notice that we did not reduce the sample space in the second solution. This is no trick: we <em>could</em> reduce the sample space and would still end up with the same answer; try it yourself. But there is no <em>need</em> to do any such thing when using Bayes&#8217; rule, because the fact that the sample space is actually reduced is inherently captured. In fact, many a time it is not possible to simply &#8220;reduce the sample space and proceed&#8221;; Bayes&#8217; rule really shines then.</p>
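<p>The second solution can also be sketched in Python, working over the full (unreduced) sample space with exact fractions. The helper <code>p</code> and the set names are made up for illustration.</p>

```python
from fractions import Fraction

# Full (unreduced) sample space, each outcome with probability 1/4.
prob = {outcome: Fraction(1, 4) for outcome in ("GG", "GB", "BG", "BB")}

E = {"GB", "BG"}        # one boy and one girl
F = {"GB", "BG", "BB"}  # at least one boy


def p(event):
    """Probability of an event as the sum over its outcomes."""
    return sum(prob[outcome] for outcome in event)


p_f_given_e = p(E & F) / p(E)            # equals 1, since E is a subset of F
p_e_given_f = p(E) * p_f_given_e / p(F)  # Bayes' rule
print(p_e_given_f)  # 2/3
```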
]]></content:encoded>
			<wfw:commentRss>http://www.grok.in/blog/2008/05/03/probabilities-huh/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>To switch or not to switch, that is the question</title>
		<link>http://www.grok.in/blog/2008/04/21/to-switch-or-not-to-switch-that-is-the-question/</link>
		<comments>http://www.grok.in/blog/2008/04/21/to-switch-or-not-to-switch-that-is-the-question/#comments</comments>
		<pubDate>Mon, 21 Apr 2008 18:27:52 +0000</pubDate>
		<dc:creator>Siddhartha Reddy</dc:creator>
				<category><![CDATA[Probability & Statistics]]></category>
		<category><![CDATA[paradox]]></category>
		<category><![CDATA[problem]]></category>

		<guid isPermaLink="false">http://www.grok.in/?p=45</guid>
		<description><![CDATA[I recently came across a very interesting problem known as &#8220;The Monty Hall Problem.&#8221; This is a statistical puzzle named after the host of an old television show &#8220;Let&#8217;s Make a Deal&#8221; which featured a similar problem albeit a little more involved than the basic version that mathematicians use. Here is a simple description of [...]]]></description>
			<content:encoded><![CDATA[<p>I recently came across a very interesting problem known as &#8220;The Monty Hall Problem.&#8221; This is a statistical puzzle named after the host of an old television show &#8220;Let&#8217;s Make a Deal&#8221; which featured a similar problem albeit a little more involved than the basic version that mathematicians use. Here is a simple description of the problem <a title="Wikipedia - Monty Hall problem" href="http://en.wikipedia.org/wiki/Monty_Hall_problem">from Wikipedia</a>:</p>
<blockquote><p>Suppose you&#8217;re on a game show, and you&#8217;re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what&#8217;s behind the doors, opens another door, say No. 3, which has a goat. He then says to you, &#8220;Do you want to pick door No. 2?&#8221; Is it to your advantage to switch your choice?</p></blockquote>
<p><span id="more-45"></span>If you have not come across this problem before, intuition would make you say &#8220;it just won&#8217;t matter&#8221; (if not, either you are <em>really</em> clever or your intuition is screwed up; I&#8217;d put my money on the latter.) It <em>does</em> matter. But I&#8217;m not going to tell you which is the better strategy (to switch or not to switch). The New York Times has a good <a title="New York Times - The Monty Hall Problem" href="http://www.nytimes.com/2008/04/08/science/08monty.html">simulation of the game</a>; go play the game yourself a few times with different strategies and I assure you that you&#8217;ll be stunned. After you&#8217;ve played it a few times, click on &#8220;How it works&#8221; to get a decent understanding of why the best strategy is what it is. If you want a better explanation, both in words and in mathematics (using Bayesian analysis), refer to the Wikipedia page linked above.</p>
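<p>If you would rather let a computer play the game a hundred thousand times for you, here is a rough sketch in Python. The function names <code>play</code> and <code>win_rate</code> are made up for illustration, and the sketch assumes that the host chooses at random whenever two goat doors are available to open. Run it and compare the two numbers it prints.</p>

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Fraction of rounds won when always switching or always staying."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```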
]]></content:encoded>
			<wfw:commentRss>http://www.grok.in/blog/2008/04/21/to-switch-or-not-to-switch-that-is-the-question/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
