Coin flips and names (Evil problems in probability continued)

In my post about the girl-named-Florida problem, there is a factor in the analysis looking at the probability of having a girl named Florida given that you have two girls: P(F|2g).

This term is easily calculated as

$$P(F|2g) = f + f - f^2 = 2f - f^2,$$

where f is the frequency of the name Florida, and this is what I used in the analysis.

Someone raised the question, “What would happen if (as we know) people don’t tend to give two children the same name (unless you’re George Foreman)?” At first, this seems exactly like a coin flip problem: in two coin flips, what is the probability of flipping heads on the first flip or on the second, but not both? It turns out that this is a different problem, and the result is surprising (at least to me). We have to be very careful about what information we condition on, since the English language is a little more fluid than we would like when dealing with such problems. In the coin flip case we define

$$H_1 \equiv \text{heads on the first flip}, \qquad H_2 \equiv \text{heads on the second flip}$$

and it follows, given that the probability of flipping heads is h,

$$P(\text{exactly one heads}) = P(H_1) + P(H_2) - 2P(H_1 H_2) = h + h - 2h^2 = 2h(1-h)$$

which is just the standard result for H_1 or H_2, with the possibility of both being heads subtracted off. For h=0.5, this yields a probability of 0.5. As h gets close to 1, the probability of heads on each flip goes up, and thus the probability of both being heads goes up as well. As a result, the probability of having exactly one heads goes to zero.
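As a quick numerical check of the coin flip behavior described above, here is a minimal sketch. The function names are my own, chosen for illustration, and the flips are assumed to be independent with the same heads probability h.

```python
def prob_exactly_one_heads(h):
    """P(exactly one heads in two independent flips), heads probability h."""
    # heads-then-tails plus tails-then-heads
    return h * (1 - h) + (1 - h) * h


def prob_at_least_one_heads(h):
    """P(at least one heads in two independent flips), heads probability h."""
    # inclusion-exclusion: subtract the both-heads case once
    return h + h - h**2


if __name__ == "__main__":
    # As h approaches 1, "exactly one" shrinks while "at least one" grows.
    for h in (0.1, 0.5, 0.9, 0.99):
        print(h, prob_exactly_one_heads(h), prob_at_least_one_heads(h))
```

The two functions differ only in how many times the both-heads term is subtracted, which is why they drift apart as h grows.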

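The situation with names is nearly the opposite: as the frequency f of a name increases, it becomes more and more likely that one of the children will have that name. The difference is in the conditioning information. Writing F_1 and F_2 for the first and second girl being named Florida,

$$P(F_1) = f, \qquad P(F_2|F_1) = 0, \qquad P(F_2|\bar{F}_1) = f$$

The analysis then goes:

$$P(F|2g) = P(F_1) + P(\bar{F}_1)\,P(F_2|\bar{F}_1) = f + (1-f)f = f + f - f^2$$

which is exactly the same result as the case where one can name both of the children Florida! Forbidding the duplicate only changes families in which the first girl is already named Florida, and those families already count. I was a little surprised by this result, but a quick simulation confirmed it as well.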


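import numpy as np

f = 0.1      # name frequency; gives the theoretical value 0.19 shown below
N = 10000    # number of simulated two-girl families (assumed; original value lost)

r1 = np.random.rand(N)
r2 = np.random.rand(N)

# True where the first / second girl is given the name
N1 = list(r1 < f)
N2 = list(r2 < f)

# Case 1: both girls may share the name
case1 = [n1 or n2 for n1, n2 in zip(N1, N2)]

print("Fraction allowing duplicate names:", case1.count(True) / float(len(case1)))
print("Theoretical Value:", f + f - f**2)

# Case 2: if the first girl has the name, the second cannot
# (loop body reconstructed from the surrounding text)
for i, (n1, n2) in enumerate(zip(N1, N2)):
    if n1:
        N2[i] = False

case2 = [n1 or n2 for n1, n2 in zip(N1, N2)]

print("Fraction not allowing duplicate names:", case2.count(True) / float(len(case2)))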

Simulation Result

Fraction allowing duplicate names: 0.1853
Theoretical Value: 0.19
Fraction not allowing duplicate names: 0.1853
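The same comparison can also be made exactly, without sampling. The sketch below is only an illustration: the function name `p_at_least_one` is hypothetical, and the no-duplicates rule is assumed to mean that the second girl cannot get the name once the first has it, while her chance is still f otherwise. The value f = 1/10 is inferred from the theoretical value 0.19 above.

```python
from fractions import Fraction


def p_at_least_one(f, allow_duplicates):
    """Exact P(at least one of two girls has the name) for name frequency f."""
    p_first = f
    # Assumption: a duplicate is simply impossible when not allowed.
    p_second_given_first = f if allow_duplicates else 0
    p_second_given_not_first = f
    # Enumerate the three outcomes in which the name appears at least once.
    both = p_first * p_second_given_first
    first_only = p_first * (1 - p_second_given_first)
    second_only = (1 - p_first) * p_second_given_not_first
    return both + first_only + second_only


if __name__ == "__main__":
    f = Fraction(1, 10)
    print(p_at_least_one(f, allow_duplicates=True))   # 19/100
    print(p_at_least_one(f, allow_duplicates=False))  # 19/100
```

Forbidding duplicates moves probability from the "both" outcome into the "first only" outcome, so the total is unchanged.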


About brianblais

I am a professor of Science and Technology at Bryant University in Smithfield, RI, and a research professor in the Institute for Brain and Neural Systems, Brown University. My research is in computational neuroscience and statistics. I teach physics, meteorology, astronomy, theoretical neuroscience, systems dynamics, artificial intelligence and robotics. My book, "Theory of Cortical Plasticity" (World Scientific, 2004), details a theory of learning and memory in the cortex, and presents the consequences and predictions of the theory. I am an avid Python enthusiast and a Bayesian (a la E. T. Jaynes), and I love music.