Chocolates, bugs, and socks: Difference between revisions

From stacky wiki
(Created page with "Every chocolate Lindsay eats, she gets a picture of a bug, randomly selected from a set of 20 bug pictures. She has collected 19 of the bug pictures. How many chocolates has ...")
 
No edit summary
 
(15 intermediate revisions by the same user not shown)
Line 1: Line 1:
Every chocolate Lindsay eats, she gets a picture of a bug, randomly selected from a set of 20 bug pictures. She has collected 19 of the bug pictures.
[[Category:Blog]]
==How many chocolates has Lindsay eaten?==


How many chocolates has she probably eaten?
Every chocolate Lindsay eats, she gets a picture of a bug, randomly selected from a set of 20 bug pictures. She just collected her 19-th unique bug picture. How many chocolates do you think she's eaten?
$$\frac{20}{20}+\frac{20}{19}+\cdots+\frac{20}{2} \approx 20\times(\ln(20) + \frac 12 - 1) \approx 50$$
She should expect to eat another 20 chocolates before getting that last bug.


What's the 90% confidence interval around that? For a rough answer, we need to know the variance in the number of trials needed before a success for a probability $p$ event. If that's $v(p)$, then the variance for our number is $v(\frac{20}{20})+v(\frac{20}{19})+\cdots+v(\frac{20}{2})$. You can compute $v(p)$ because we know the distribution for number of trials before success is an exponential. Do that. Also, what about getting the exact 90% CI?
When she's collected 7 unique bugs, each chocolate brings with it a $\frac{13}{20}$ chance of getting an 8-th bug. How many chocolates does she have to eat before that happens, and with what variance? Well, for any event with probability $p$ of success, the expected number of trials before the first success is $\langle n\rangle =\sum_{n=1}^\infty n(1-p)^{n-1}p = \frac 1p$, and the expected square of the number of trials before success is $\langle n^2\rangle =\sum_{n=1}^\infty n^2(1-p)^{n-1}p = \frac{2-p}{p^2}$, so the variance is $\langle n^2\rangle - \langle n\rangle^2 = \frac 1p - 1$. So the expected number of chocolates Lindsay ate is
$$\frac{20}{20}+\frac{20}{19}+\cdots+\frac{20}{2} \approx 20\times(\ln(20) + \frac 12 - 1) \approx 50.$$
(This was my back-of-the-envelope guess using that $\ln(20)\approx 3$, and that the Euler-Mascheroni constant is about 0.5. The actual value is very close to 52.) The number of chocolates eaten to get from 5 to 6 bugs is independent of the number of chocolates eaten to get from 6 to 7 bugs, so the variances add too:
$$\Bigl(\frac{20}{20} - 1\Bigr) + \Bigl(\frac{20}{19} - 1\Bigr) + \cdots \Bigl(\frac{20}{2} - 1\Bigr) \approx 33.$$
(Same as previous number, minus 19.) So the standard deviation is $\sqrt{33}\approx 5.74$. Assuming this sum of random variables is roughly Guassian (how good is this assumption?), the 90% confidence interval (z-score 1.645) for how many chocolates Lindsay has eaten is $52\pm 9.4$.


----
Note: she should expect to eat another 20 chocolates before getting that last bug.


I've just pulled 20 socks out of a pile of clean laundry one at a time. If I've made 7 pairs and have 6 singletons, how many socks do I expect are in the pile (assuming each sock has a unique partner)? I can compute the likelihood function. If there are $n$ pairs of socks, I pull out $k$, and make $p$ pairs, the likelihood is
==How many unique bugs are there?==
Suppose Lindsay '''knows''' she's eaten 50 chocolates, among which there were 19 unique bugs, but she doesn't know how many unique bugs there are out there. If she's eaten n chocolates and found k unique bugs, and there are b unique bugs, the likelihood is
$$
\binom{b}{k}k^{n-k}b^{-n}
$$
(pick the subset of k, which also uses up k of your chocolates, then distribute the remaining n-k chocolates among those k bugs). For n=50 and k=19, the maximally likely answer is that there are [http://www.wolframalpha.com/input/?i=plot+%28b+choose+k%29+*+k%5E%28n-k%29%2F%28b%5En%29+for+n%3D50%2C+k%3D19%2C+b%3D19+to+22 20 bugs].
 
==How many socks do I have?==
 
I've just pulled 10 socks out of a pile of laundry, making 2 pairs and 6 singletons. How many pairs of socks are there total (assuming each sock has a unique partner)?
 
If there are $n$ pairs of socks, I pull out $k$, and make $p$ pairs, the likelihood is
$$\binom np \binom{n-p}{k-2p} \binom{2n}{k}^{-1}$$
$$\binom np \binom{n-p}{k-2p} \binom{2n}{k}^{-1}$$
(first pick the $p$ pairs, then $k-2p$ singletons)
(first pick the $p$ pairs, then $k-2p$ singletons). Maximum likelihood answer to the original question: [http://www.wolframalpha.com/input/?i=plot+%28n+choose+p%29+*+%28%28n-p%29+choose+k-2*p%29+%2F+%282*n+choose+k%29+for+k%3D10%2C+p%3D2%2C+n%3D7+to+15 11 pairs] (so $22-10=12$ socks left). Of course, you'd usually have a pretty good prior multiplying this likelihood function.

Latest revision as of 08:32, 7 March 2016

How many chocolates has Lindsay eaten?

Every chocolate Lindsay eats, she gets a picture of a bug, randomly selected from a set of 20 bug pictures. She just collected her 19-th unique bug picture. How many chocolates do you think she's eaten?

When she's collected 7 unique bugs, each chocolate brings with it a $\frac{13}{20}$ chance of getting an 8-th bug. How many chocolates does she have to eat before that happens, and with what variance? Well, for any event with probability $p$ of success, the expected number of trials before the first success is $\langle n\rangle =\sum_{n=1}^\infty n(1-p)^{n-1}p = \frac 1p$, and the expected square of the number of trials before success is $\langle n^2\rangle =\sum_{n=1}^\infty n^2(1-p)^{n-1}p = \frac{2-p}{p^2}$, so the variance is $\langle n^2\rangle - \langle n\rangle^2 = \frac 1p - 1$. So the expected number of chocolates Lindsay ate is $$\frac{20}{20}+\frac{20}{19}+\cdots+\frac{20}{2} \approx 20\times(\ln(20) + \frac 12 - 1) \approx 50.$$ (This was my back-of-the-envelope guess using that $\ln(20)\approx 3$, and that the Euler-Mascheroni constant is about 0.5. The actual value is very close to 52.) The number of chocolates eaten to get from 5 to 6 bugs is independent of the number of chocolates eaten to get from 6 to 7 bugs, so the variances add too: $$\Bigl(\frac{20}{20} - 1\Bigr) + \Bigl(\frac{20}{19} - 1\Bigr) + \cdots \Bigl(\frac{20}{2} - 1\Bigr) \approx 33.$$ (Same as previous number, minus 19.) So the standard deviation is $\sqrt{33}\approx 5.74$. Assuming this sum of random variables is roughly Guassian (how good is this assumption?), the 90% confidence interval (z-score 1.645) for how many chocolates Lindsay has eaten is $52\pm 9.4$.

Note: she should expect to eat another 20 chocolates before getting that last bug.

How many unique bugs are there?

Suppose Lindsay knows she's eaten 50 chocolates, among which there were 19 unique bugs, but she doesn't know how many unique bugs there are out there. If she's eaten n chocolates and found k unique bugs, and there are b unique bugs, the likelihood is $$ \binom{b}{k}k^{n-k}b^{-n} $$ (pick the subset of k, which also uses up k of your chocolates, then distribute the remaining n-k chocolates among those k bugs). For n=50 and k=19, the maximally likely answer is that there are 20 bugs.

How many socks do I have?

I've just pulled 10 socks out of a pile of laundry, making 2 pairs and 6 singletons. How many pairs of socks are there total (assuming each sock has a unique partner)?

If there are $n$ pairs of socks, I pull out $k$, and make $p$ pairs, the likelihood is $$\binom np \binom{n-p}{k-2p} \binom{2n}{k}^{-1}$$ (first pick the $p$ pairs, then $k-2p$ singletons). Maximum likelihood answer to the original question: 11 pairs (so $22-10=12$ socks left). Of course, you'd usually have a pretty good prior multiplying this likelihood function.