I read this article in The Week magazine about the upcoming census. I plan to look at statistical sampling later, but I was struck by the following passage:
Because supporters and opponents tend to break down along partisan lines. Democrats favor sampling because the people who are traditionally hardest to count are the urban poor, minorities, and immigrants, all of whom tend to live in Democratic strongholds and vote Democratic. These groups are often undercounted because they move so frequently and do not trust government employees asking questions. Republicans, by contrast, stress that the Constitution specifies an “actual enumeration” of the population, not an estimate. They also argue that statistical sampling is inferior to counting. “Anyone familiar with public opinion polling can tell you that statistical sampling carries a margin of error,” Republican Reps. Darrell Issa and Patrick McHenry recently wrote. “And error is the enemy of a full and accurate census.”
The notion that a national count is completely error-free is ridiculous. I think everyone would agree that if you do a count, you will not get everyone. It is known that mistakes are made, omissions occur, and some people actively avoid the census. Because census avoidance is not random, the omissions are biased in some way. One can argue about which direction the bias points, but the bias is there.
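To make the point about non-random omissions concrete, here is a small arithmetic sketch. The two groups, their sizes, and their counting rates are invented purely for illustration; they are not census figures. It shows that when one group is missed more often than another, the count is not only too low overall but also skewed in its composition.

```python
# Hypothetical groups: (true population, fraction successfully counted).
# All numbers are made up for illustration only.
groups = {
    "easy_to_count": (900_000, 0.99),
    "hard_to_count": (100_000, 0.85),
}

# Expected number of people counted in each group.
counted = {name: pop * rate for name, (pop, rate) in groups.items()}

true_total = sum(pop for pop, _ in groups.values())
counted_total = sum(counted.values())

# Share of the population that belongs to the hard-to-count group,
# in reality and as it appears in the enumeration.
true_share = groups["hard_to_count"][0] / true_total
counted_share = counted["hard_to_count"] / counted_total

print(f"overall undercount: {1 - counted_total / true_total:.1%}")
print(f"hard-to-count share, true:    {true_share:.1%}")
print(f"hard-to-count share, counted: {counted_share:.1%}")
```

With these made-up numbers the overall undercount is about 2.4%, but the hard-to-count group shrinks from 10% of the true population to roughly 8.7% of the counted one. The error does not average out; it leans in one direction.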
So what is the best plan of action in this case? You want to make an estimate of the number of people in the country. “Estimate” is the correct word, even for an enumeration, given that the enumeration is known to be incomplete. The best thing to do, then, is to have a public and open statistical model of the process of sampling, with independent ways to confirm the validity of the model. If the model is simple and open, it would be difficult to argue against. Without this approach, a “strict enumeration” is really an unstated statistical model whose assumptions are very difficult to see.
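As one example of what a simple, open model might look like, here is a sketch of the classic two-count (Lincoln-Petersen, or capture-recapture) estimator. This is my choice of illustration, not something proposed in the article, and the function name and numbers are hypothetical. The idea is to compare the enumeration against an independent follow-up survey of the same area and use the overlap to estimate how many people both counts missed.

```python
def lincoln_petersen(census_count, survey_count, matched):
    """Estimate a total population from two counts of the same area.

    census_count: people found by the enumeration
    survey_count: people found by an independent follow-up survey
    matched:      people found by both

    Key assumption (stated openly): the two counts are independent,
    so the fraction of survey respondents who also appear in the
    census estimates the census's coverage rate.
    """
    if matched <= 0:
        raise ValueError("need at least one person found by both counts")
    if matched > min(census_count, survey_count):
        raise ValueError("matched cannot exceed either count")
    return census_count * survey_count / matched


# Hypothetical numbers for a single area.
estimate = lincoln_petersen(census_count=9_000, survey_count=1_000, matched=920)
print(f"estimated population: {estimate:.0f}")
```

Here 920 of the 1,000 surveyed people were in the census, suggesting about 92% coverage, so the 9,000 counted become an estimate of roughly 9,783. The independence assumption is certainly debatable, since people who avoid the census may also avoid the survey, but that is the point: the assumption is written down where anyone can see it and argue with it.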