# Chocolates, bugs, and socks

## How many chocolates has Lindsay eaten?

Every chocolate Lindsay eats, she gets a picture of a bug, randomly selected from a set of 20 bug pictures. She just collected her 19-th unique bug picture. How many chocolates do you think she's eaten?

When she's collected 7 unique bugs, each chocolate brings with it a $\frac{13}{20}$ chance of getting an 8-th bug. How many chocolates does she have to eat before that happens, and with what variance? Well, for any event with probability $p$ of success, the expected number of trials before the first success is $\langle n\rangle =\sum_{n=1}^\infty n(1-p)^{n-1}p = \frac 1p$, and the expected square of the number of trials before success is $\langle n^2\rangle =\sum_{n=1}^\infty n^2(1-p)^{n-1}p = \frac{2-p}{p^2}$, so the variance is $\langle n^2\rangle - \langle n\rangle^2 = \frac 1p - 1$. So the expected number of chocolates Lindsay ate is $$\frac{20}{20}+\frac{20}{19}+\cdots+\frac{20}{2} \approx 20\times(\ln(20) + \frac 12 - 1) \approx 50.$$ (This was my back-of-the-envelope guess using that $\ln(20)\approx 3$, and that the Euler-Mascheroni constant is about 0.5. The actual value is very close to 52.) The number of chocolates eaten to get from 5 to 6 bugs is independent of the number of chocolates eaten to get from 6 to 7 bugs, so the variances add too: $$\Bigl(\frac{20}{20} - 1\Bigr) + \Bigl(\frac{20}{19} - 1\Bigr) + \cdots \Bigl(\frac{20}{2} - 1\Bigr) \approx 33.$$ (Same as previous number, minus 19.) So the standard deviation is $\sqrt{33}\approx 5.74$. Assuming this sum of random variables is roughly Guassian (how good is this assumption?), the 90% confidence interval (z-score 1.645) for how many chocolates Lindsay has eaten is $52\pm 9.4$.

Note: she should expect to eat another 20 chocolates before getting that last bug.

## How many unique bugs are there?

Suppose Lindsay knows she's eaten 50 chocolates, among which there were 19 unique bugs, but she doesn't know how many unique bugs there are out there. If she's eaten n chocolates and found k unique bugs, and there are b unique bugs, the likelihood is $$\binom{b}{k}k^{n-k}b^{-n}$$ (pick the subset of k, which also uses up k of your chocolates, then distribute the remaining n-k chocolates among those k bugs). For n=50 and k=19, the maximally likely answer is that there are 20 bugs.

## How many socks do I have?

I've just pulled 10 socks out of a pile of laundry, making 2 pairs and 6 singletons. How many pairs of socks are there total (assuming each sock has a unique partner)?

If there are $n$ pairs of socks, I pull out $k$, and make $p$ pairs, the likelihood is $$\binom np \binom{n-p}{k-2p} \binom{2n}{k}^{-1}$$ (first pick the $p$ pairs, then $k-2p$ singletons). Maximum likelihood answer to the original question: 11 pairs (so $22-10=12$ socks left). Of course, you'd usually have a pretty good prior multiplying this likelihood function.