That makes sense to me, but I’m also looking at Jaynes in a more practical sense for social science data analysis.

I’m working through a hard copy of your book and I stumbled upon Gelman’s criticism that Jaynes’s approach might not work well for messier datasets (people vs. physics) that lack objective parameters.

Gelman seems to take issue with Jaynes in that:

1. The prior or model cannot be falsified and there isn’t enough model checking. He is more of a falsificationist.

2. Gelman has a more frequentist definition of probability.

3. Priors cannot be changed, adjusted, or added.

4. It is often impossible to know the true prior. Is it then OK to have a subjective one?

5. All models are wrong in social science, but some are less wrong or more useful, so it is important to cycle through models.

Jaynes seems to be purist and justified, whereas Gelman has a pragmatic approach drawn from practice and the incorporation of other philosophies.

Since you have cited Gelman’s book, what are your thoughts on these points? The practical implication is that I’m going to be analyzing social science data, and Gelman, on those dimensions, seems to make more sense, but Jaynes is more philosophically grounded.

Is your book pure Jaynes, or would it work for a social scientist? I’m particularly interested in the idea of subjective priors incorporating theory or other non-statistical experiments like ABM, falsifying and cycling through models, etc.

Thanks!

What do you think?

https://meaningness.com/probability-and-logic

http://www.stat.columbia.edu/~gelman/research/published/philosophy.pdf

Now, imagine you are looking at a similar report, where there is a ‘y’ if there is at least one girl. This is what is required to get the answer 1/3 for the version of the question without a name being mentioned. The assumptions required aren’t quite as extreme, but are still not implied by the question.

Now imagine this: You were looking at a database of families. You recall several that caught your eye because of an unusual name. Half were boys’ names, and half were girls’ names. But the only name you specifically recall was “Florida.” Isn’t this more reasonable? This makes the answer exactly 1/2, because you must consider it equally likely that you would have noticed Florida’s brother Tex first, and recalled that.

The issue can be boiled down further than what you did. Did you notice this family because you first picked the name ‘Florida’ to be a name of interest, and sought examples? Or did the name merely catch your eye? In the first case, your conditional sample will include *ALL* families with a ‘Florida.’ In the second, it must exclude families with a ‘Florida’ where you would have noticed her sister ‘Georgia,’ or her brother ‘Indiana,’ instead. The answer is (2-f)/(4-f) in the first case, and 1/2 in the second.
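To make the two cases concrete, here is a minimal sketch that enumerates them exactly. It rests on assumptions that are not spelled out in the comment: each child is independently a boy or a girl with probability 1/2, `f` is the fraction of girls named Florida, two sisters may both carry the name, and in the "catch your eye" case you notice one of the two children at random. The function name `florida_answers` is just illustrative.

```python
from fractions import Fraction
from itertools import product

def florida_answers(f):
    """Return (P(two girls | sought Florida), P(two girls | noticed Florida))."""
    half = Fraction(1, 2)
    # Assumed model: a child is a boy, a girl named Florida, or another girl.
    kinds = {"boy": half, "florida": half * f, "girl": half * (1 - f)}
    sought_hit = sought_gg = Fraction(0)
    noticed_hit = noticed_gg = Fraction(0)
    for (k1, p1), (k2, p2) in product(kinds.items(), repeat=2):
        p = p1 * p2
        two_girls = k1 != "boy" and k2 != "boy"
        # Case 1: every family containing a Florida is in your sample.
        if "florida" in (k1, k2):
            sought_hit += p
            if two_girls:
                sought_gg += p
        # Case 2: one child, picked at random, is the one you notice.
        for noticed in (k1, k2):
            if noticed == "florida":
                noticed_hit += p * half
                if two_girls:
                    noticed_gg += p * half
    return sought_gg / sought_hit, noticed_gg / noticed_hit

f = Fraction(1, 100)  # hypothetical name frequency, for illustration only
sought, noticed = florida_answers(f)
print(sought, (2 - f) / (4 - f), noticed)
```

Passing `f` as a `Fraction` keeps the arithmetic exact, so the first result can be compared directly with (2-f)/(4-f).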

Or in the simpler problem (without names), did you choose to look at only families that have a girl, or did the fact that a random family has a girl catch your eye? The answer is 1/3 in the first case, and 1/2 in the second. And one of the great things about preferring the second option is that, as expected, the answer doesn’t change if you also know the girl is a red-headed, left-handed fan of the band U2.
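The same distinction can be checked by hand for the problem without names. This is only a sketch, assuming four equally likely families (GG, GB, BG, BB) and, for the second case, that each of the two children is equally likely to be the one who catches your eye.

```python
from fractions import Fraction
from itertools import product

families = list(product("GB", repeat=2))  # four equally likely families
quarter = Fraction(1, 4)

# Case 1: you chose to look only at families that have a girl.
has_girl = [fam for fam in families if "G" in fam]
p_selected = Fraction(sum(fam == ("G", "G") for fam in has_girl), len(has_girl))

# Case 2: one child of a random family catches your eye, and it is a girl.
seen_girl = Fraction(0)
seen_girl_and_gg = Fraction(0)
for fam in families:
    for child in fam:  # each child is the noticed one with probability 1/2
        if child == "G":
            seen_girl += quarter / 2
            if fam == ("G", "G"):
                seen_girl_and_gg += quarter / 2
p_noticed = seen_girl_and_gg / seen_girl

print(p_selected, p_noticed)
```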

I suggest you look up Bertrand’s Box Paradox, change “silver” to “bronze,” and add the obvious fourth box to complete the analogy to the two-child problem. Take the simpler problem, which asks for the probability that both (metals, genders) match. If the answer is 1/3 when you know there is a (gold coin, girl), then it is also 1/3 when you know there is a (bronze coin, boy). If it is 1/3 regardless of which one (metal, gender) you know, it is 1/3 even if you don’t know a (metal, gender). But we know the answer is 1/2.

“I’m looking in a report with statistical data for families with two children. The data is not sorted in any way. The pages are filled with last names (not relevant for the riddle) and a ‘y’ if there is at least one girl with the name Florida, and an ‘n’ if there is no girl by that name. I chose an entry randomly and I see a ‘y’.

I know Florida is a very rare name. What are the odds the other child is also a girl?”

There is more chance of a Florida in a two-girl family than in a one-girl family, so if I see a ‘y’ the chance that I’m dealing with a two-girl family is larger.

The same can’t be said of the original problem (where it is known that at least one is a girl).

So one cannot say: “There is more chance of a girl in a two-girl family than in a one-girl family, so if I see a ‘y’ the chance that I’m dealing with a two-girl family is larger.” It would be nonsense to say this.

A) Be removed from your probability analysis because you would have noticed both?

B) Be included as a 50% conditional probability because you’d notice “Tex” half of the time when both appear?

C) Be treated as a 100% conditional probability because you’d always notice “Florida” when both appear?

D) Something else?

I agree that the problem does not provide enough information to support a definitive answer. Any answer must be based on some set of assumptions that fill in the blanks about why you recall the fact that you recall. But the only *reasonable* assumption is that you are equally likely to recall a boy, or a girl, in a family that has both; when considered over all possibilities for why you recall only one. AND THIS MAKES THE ANSWER TO BOTH PROBLEM #1, and #2, be 1/2.

The controversy over this family of problems occurs because the *event* “you know that there is at least one girl with (insert possible other information here)” is not the same thing as the *fact* “there is at least one girl with (insert possible other information here).” There are two children in this family, so most of the time there is another *fact* that could be the *event* you recall. This difference was brought to the attention of the world first by Joseph Bertrand in 1889, in what he called the Box Paradox. It was a cautionary tale, warning people to not confuse facts with events. And it is still being ignored. To make it more appropriate here, I’ll apply the paradox to the variations of Problem #2 above:

In 2.1, Brian correctly says that the probability that both cards are Hearts is 12/56=3/14. The probability that both are Spades is also 3/14. So the probability that both are the same suit is 3/14+3/14=3/7. (You could also get this result by asking if the second card’s suit matches the first card’s.)

In 2.2, you are told that at least one card is a Heart. Brian claims that the probability that both are Hearts, given this information, is 12/44=3/11. We can only assume that he would say the probability of two Spades, if you are told that at least one is a Spade, is also 3/11.
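These fractions can be reproduced by brute-force enumeration. The sketch below assumes an eight-card deck of four Hearts and four Spades with two cards dealt; that deck is inferred from the 12/56 and 12/44 figures and is not stated in this comment.

```python
from fractions import Fraction
from itertools import permutations

deck = ["H"] * 4 + ["S"] * 4             # assumed deck: 4 Hearts, 4 Spades
deals = list(permutations(range(8), 2))  # 56 ordered two-card deals

def suits(deal):
    """Suits of the two dealt cards, in order."""
    return tuple(deck[i] for i in deal)

both_hearts = [d for d in deals if suits(d) == ("H", "H")]
same_suit = [d for d in deals if suits(d)[0] == suits(d)[1]]
some_heart = [d for d in deals if "H" in suits(d)]

p_both_hearts = Fraction(len(both_hearts), len(deals))             # 2.1
p_same_suit = Fraction(len(same_suit), len(deals))                 # 2.1
p_hearts_given_fact = Fraction(len(both_hearts), len(some_heart))  # 2.2, treating "at least one Heart" as a bare fact

print(p_both_hearts, p_same_suit, p_hearts_given_fact)
```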

But what if the observer, instead of telling you the suit (s)he observed, writes it down without showing you? If (s)he wrote “Hearts,” the probability of matching suits seems to be 3/11. If (s)he wrote “Spades,” the probability of matching suits seems also to be 3/11. So irrespective of what was written, the chances of matching suits seems to have changed from the 3/7 found in 2.1, to the 3/11 found in 2.2. Yet we have not gained any actual information about the cards! So how can this change occur?

The answer is, it can’t. The *event* where this observer tells you a suit is not the same as the *fact* applying to the pair of cards. When the cards are the Ace of Hearts and the Deuce of Spades, the observer can tell you either “There is at least one Heart” or “There is at least one Spade.” Not knowing how (s)he would choose, we can only assume it is random.

This changes the denominator of the equation Brian used to solve problem 2.2. Using “O” to indicate the *event* where a suit is told to you, it should be:

Pr(OH|H1,H2)*Pr(H1,H2) + Pr(OH|H1,S2)*Pr(H1,S2) + Pr(OH|S1,H2)*Pr(S1,H2) + Pr(OH|S1,S2)*Pr(S1,S2)

= (1)*(4/8)*(3/7) + (1/2)*(4/8)*(4/7) + (1/2)*(4/8)*(4/7) + (0)*(4/8)*(3/7)

= 1/2.

(If you doubt this, please check out any textbook’s definition of Bayes’ Theorem.)

Note that, to be rigorously complete, I included the 0% chance that the observer would say “Heart” when there were two Spades. It is not surprising that this result is 1/2, since it represents the probability the observer would say “Heart” when we have no information. It is also not surprising that using this denominator in the solution to 2.2 makes the answer 3/7, the same as in 2.1, resolving the paradox found above.
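As a check on the denominator worked out above, here is a minimal sketch of the observer model. It assumes the same eight-card deck implied by the 4/8 and 3/7 factors, and an observer who names the suit of one of the two dealt cards chosen at random, so “Heart” is said with probability 1, 1/2, or 0 depending on the pair.

```python
from fractions import Fraction
from itertools import permutations

deck = ["H"] * 4 + ["S"] * 4             # assumed deck: 4 Hearts, 4 Spades
deals = list(permutations(range(8), 2))  # 56 equally likely ordered deals
p_deal = Fraction(1, len(deals))

p_says_heart = Fraction(0)         # the denominator, Pr(OH)
p_says_heart_and_hh = Fraction(0)  # the numerator, Pr(OH and two Hearts)
for first, second in deals:
    pair = (deck[first], deck[second])
    # Pr(OH | pair): the share of the two cards that are Hearts.
    p_oh = Fraction(pair.count("H"), 2)
    p_says_heart += p_oh * p_deal
    if pair == ("H", "H"):
        p_says_heart_and_hh += p_oh * p_deal

p_hh_given_says_heart = p_says_heart_and_hh / p_says_heart
print(p_says_heart, p_hh_given_says_heart)
```

Under these assumptions, hearing “Heart” leaves the chance of two Hearts at 3/7, which is also the chance of matching suits before anything was said.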

Finally, I am not updating this WordPress blog, except to point to my main website at: http://web.bryant.edu/~bblais, just in case you’re interested in following along over there.

“I’m looking in a report with statistical data for families with two children. The data is not sorted in any way. I have chosen an entry randomly and I see one of the children is named ‘Florida’.

I know Florida is a very rare name. What are the odds the other child is also a girl?”

Now it is obvious that you do have to include the chance that the first girl is called Florida into the equation.

The problem with the original riddle (“Say you know a family has…”) is that it could be the scenario that I have just proposed, OR:

it could be that I have just made up a riddle for fun and incorporated an uncommon name in it. Or I just looked in a book of uncommon names before making up this riddle, and one name was still in my mind: ‘Florida’. Obviously I was biased.

In the latter case, knowing that one girl’s name is Florida doesn’t change the odds compared with knowing only that one of the two is a girl (the odds are 1/3).

So my conclusion is that the riddle in itself is poorly stated and has two possible answers.
