Election 2016: The Battle of the Statisticians...

Macrobius Wednesday 2 Nov 2016

So to make a long story short, I think the >0.1% loss function (L) that Salil Mehta is reporting is the usual bias in polls + a *difference* between the liberal and conservative polls, because they have different Vu - Vb terms. The easiest way this can happen is for the liberal polls that Mehta identifies (say) to have more mislabelled or biased cases than the conservative polls. So they end up with different Vu, Vb terms, and this drives a *big* difference in prediction error.

Notice, as I've pointed out, that 0.1% prediction error means your data are useless for making election predictions. You have a 50% chance of mispredicting at least one state. You can't predict anything at that error rate -- it's like a disk drive having one error per word. The average word is garbled. Useless.

Macrobius Thursday 3 Nov 2016

Silver's latest -- he's not going to have much more data than this, and has been steady at his 70-30 call for a while.

The article is worth a read:

http://fivethirtyeight.com/features/election-update-the-how-full-is-this-glass-election/

The election will be won in the following states:

Trump: OH, FL, NC, NV
Clinton's 'firewall': CO, PA, NH, MI, WI, VA

Trump has to pierce what liberals call 'the firewall' in at least one, maybe two places to win. If he loses any of OH, FL, NC he could well lose without improbable wins among the firewall states.

*That said* look at the difference between 'live poller calls' in MI and WI vs other means of polling. Something is going on with the bias for people who actually answer their phones -- not unlike 1948 and Dewey/Truman of course. It's a tough year to be a pollster because calling the election is *really* about understanding your own polling biases.

Broseph Thursday 3 Nov 2016

You got that right. They're into data, not politics.

Macrobius Sunday 6 Nov 2016

So Sunday's the day we do a 'soft close' on predictions for the election, at least in the sporting sense. This thread is about fixing Silver and we have what is pretty much his final call. I think he has a lot right:

- 60/40 is much closer to Clinton's odds than 85% or whatever his tune was a few weeks ago.

- I think he is correct this election will be all about turnout. Whatever the outcome, Clinton has a bigger org on the ground and more advertising and so *if* those things matter in such a partisan race, and if the events of the last week are not sufficient to discourage the right groups, he is correct it is bad for Trump's chances.

- I think I detect a last minute swing towards Clinton, at least in the press, from the darkest days of last weekend. Both sides are hardening towards a partisan contest.

I have some points of criticism:

- Taleb's criticism that Silver's numbers don't behave like probabilities at all, but are too volatile. My thought is that means Silver's 'weights' for the different polls must be wrong.

- The objections Salil Mehta made and what they mean for poll averaging in terms of the bias-variance tradeoff and the abnormal, > 0.1% bias of certain liberal polls -- which could confound Silver's methodology.

- my independent comments that a binary 1/0 loss function (and logistic regression) would be a better way to go than Silver's poll averaging anyway.

Finally, there is the problem that

So, putting that all together, I'm *starting* from where Silver is, and if I say anything different in the end, I have to give a reason for it. This certainly biases what I'm going to say -- but I will do my best to point out the implications of my disagreements with Silver at least, in quantitative form, as the week wears on. (And this certainly won't stop on 11/8. There will be exit polls, recounts, faithless electors, a host of claims about numbers to be sifted, as well as post mortem consideration of the election itself!)

The Alt Right doesn't have its own stats guy (if you know of one do link!) - at least there is no clearly identified person from the Alt Right who is a statistician and practicing his craft. Gentlemen, we have a vacancy to fill here.... I'll do my best.

Macrobius Sunday 6 Nov 2016

Silver's 'final call' column - http://fivethirtyeight.com/features...aign-is-almost-over-and-heres-where-we-stand/

Macrobius Thursday 10 Nov 2016

Nate Silver's own post mortem:

http://fivethirtyeight.com/features/what-a-difference-2-percentage-points-makes/

What he said going in

http://fivethirtyeight.com/features...of-outcomes-and-most-of-them-come-up-clinton/

post mortem by Andrew Gelman:

http://andrewgelman.com/2016/11/09/polls-just-fine-blue-states-blew-red-states/
http://andrewgelman.com/2016/11/09/explanations-shocking-2-shift/

The curvature in the last graph of the first article is interesting -- the curvature persists down into states that were Romney Blue, but Trump Red, the so called swing states. This suggests the problem was *not* Shy Tory, since the primary curvature is in Red States, where the Shy Tory effect is weakest. Also, the Redder the state, the stronger the effect. Maybe our Tories aren't 'Shy' -- perhaps they are Belligerent when pollsters call.

More of a Pissed Off Tory effect...

P.S. Two other parts of the story:

– Voter enthusiasm. The claim has been made that Trump’s supporters had more enthusiasm for their candidate. They were part of a movement (as with Obama 2008) in a way that was less so for Clinton’s supporters. That enthusiasm could transfer to unexpectedly high voter turnout, with the twist that this would be hard to capture in pre-election surveys if Trump’s supporters were, at the same time, less likely to respond to pollsters.

– The “ground game” and social media. One reason the election outcome came as a surprise is that we kept hearing stories about Hillary Clinton’s professional campaign and big get-out-the-vote operation, as compared to Donald Trump’s campaign which seemed focused on talk show appearances and twitter. But maybe the Trump campaign’s social media efforts were underestimated.

P.P.S. One more thing: I think one reason for the shock is that people are reacting not just to the conditional probability, Pr (Trump wins | Trump reaches Election Day with 48% of two-party support in the polls), but to the unconditional probability, Pr (Trump becomes president of the United States | our state of knowledge two years ago). That unconditional probability is very low. And I think a lot of the stunned reaction is in part that things got so far.

To use a poker analogy: if you’re drawing to an inside straight on the river, the odds are (typically) against you. But the real question is how you got to the final table of the WSOP in the first place.

Reuters post mortem

http://www.zerohedge.com/news/2016-11-10/reuters-explains-how-it-rigged-its-polls

Commentary at ZH:

Direct link: http://mobile.reuters.com/article/i...9552&utm_medium=trueAnthem&utm_source=twitter

Their explanation is silly. They are claiming correlations between demographically similar states came as a surprise to statisticians.

Macrobius Thursday 10 Nov 2016

So let's do a *real* analysis. There is a graph from the Gelman link above:

[IMG]

The media (and Gelman) are talking about a "2% bias" (shift up and down). But that's not all of the story, is it? What about that tilt?

It is easiest to think of voat counts, sometimes, rather than percentages.

Eyeballing the band from OR to TN we get a 2x2 linear system:

-0.005 = a*T + b*H for T=0.3 and H=0.7

0.05 = a*T + b*H for T=0.6 and H=0.4

So we get, for a line down the middle, something like a=0.123 and b=-0.06

Well, that means Trump voats have to be counted 2:1 to predict the election, across the board!
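As a check on the arithmetic, the 2x2 system above can be solved directly. This is only a sketch: it uses Cramer's rule in plain Python, the function name `solve_2x2` is made up for illustration, and the inputs are the eyeballed OR and TN readings quoted above, not fitted values. Solved as written, the system gives a of about 0.123 and b of about -0.06, so the ratio of the two weights comes out close to 2:1.

```python
def solve_2x2(t1, h1, r1, t2, h2, r2):
    """Solve a*t1 + b*h1 = r1 and a*t2 + b*h2 = r2 for (a, b) by Cramer's rule."""
    det = t1 * h2 - h1 * t2
    if det == 0:
        raise ValueError("the two points give a singular system")
    a = (r1 * h2 - h1 * r2) / det
    b = (t1 * r2 - r1 * t2) / det
    return a, b


# Eyeballed readings from the thread: (T, H, residual) at each end of the band.
a, b = solve_2x2(0.3, 0.7, -0.005, 0.6, 0.4, 0.05)

# Relative weight of a Trump vote versus a Clinton vote in the detrending line.
ratio = a / abs(b)

print(f"a = {a:.3f}, b = {b:.3f}, ratio = {ratio:.2f}")
```

Moving either eyeballed point a little changes a and b, so the ratio is best read as "roughly two", not as a precise estimate.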

Let's say that again: the polling models are *so biased* against Trump, that for every extra Trump voat you find you have to weight it 2x for every Clinton voat you find, just to compensate for your 'invalid demographic' model (that's what the Reuters statistician was probably talking about, or his dumbed down explanation, before the Reuters journalist wrote it up to the best of his limited understanding).

What this means is that 'oversampling Democrats' is a dandy way to rig a poll. No one pays attention to your prediction in CA or ND, where the conclusion is wrong, and you only 'shift' things by a few percent in the states that matter for the prediction. And, of course, it allows you to make predictions like 'the Democrats will win with 90% confidence' and demoralise the opposition, without appearing to be the total crook that you are.

Anyway, we can 'unscrew' the data if we rotate the whole thing around OR until TN comes down 4%. Then we get a pretty good prediction. It *still* has a 'maximal axis of confusion' from top to bottom of 6-7% and *that's* the important signal here.

Now let's look at another graph.

[IMG]

Romney 2012 vs Trump 2016. Notice anything? The 'errors' fall in the same places. It looks like the polls basically predicted a rematch of Romney 2012, but *something different happened*. You can unskew the election vs election data (that is, nothing to do with polls) by rotating around OR and bringing TN down 4%.... There is still the same 2:1 weighting of Trump needed to unskew the polls (meaning the polls are no better than just predicting Romney 2012 without asking anyone questions at all, and they are oversampling exactly the same way).

This means, by the way, that we are dealing with reality, not polls, when we look at this chart. The polls and the 2012 election are in one box of rocks, and the Trump 2016 election is a different kettle of fish -- *turnout changed* and no one told the pollsters. New voaters registered [differentially Trump supporters], moar Republicans turned up, and fewer Dems came out for Hillary.

But now we get to the *really* interesting part. Why did Trump do so poorly in TX, and so much better in MN (even if not well enough to win...)? There's a 6% differential as we walk from Texas to Minn!

TX AZ GA NC FL CO NV PA NH MN WI MI ME in the Romney vs Trump data
TX AZ GA VA CO FL MI NC NH PA ME MN WI in the Polls vs Election 2016 data

Let's call this axis 'Texicity' (distance from TX). TX is the center of bad news for the Alt Right (and Trump), and the news gets better the further you get from Texas... until it gets absolutely awesome by the time you get to the rust belt.

There are a couple possible explanations:

- CA and MA (the left and right coasts) are on top of each other -- the polls are pretty accurate in coastal urban areas. They *suck* in flyover country.

- There is also a N-S gradient of liberalism, with liberals hugging the Canadian border.

- obviously it could be Hispanics. One thing that stands out is that Rural Hispanics in the Southwest, Urban Hispanics, and Migratory Hispanics, are three different populations. The first two are heavily Democratic but likely have different tendencies for turnout (and ability to contact them by phone).

*Leaving aside the West coast* (which doesn't get pipeline oil from Texas, and has untypical taxes on gasoline), here is the price of gasoline:

https://www.gasbuddy.com/GasPriceMap?z=4
[IMG]

Green is (relatively) bad news for Trump and the Alt Right, and Orange is good news, east of the Rockies, except for New York and Chicago. PA especially stands out, as do IA, and upstate MI and WI. MN not so much.

(the two states that don't fit the pattern, MO and SC, have recently had extreme and extended racial conflict between blacks and whites. They may be somewhat special).

- Of course the incidence of Hispanics and the availability of cheap gas are not independent factors. The cost of gas as you get further from Texas and Louisiana goes up about 25 cents a gallon. This is the *integrated cost* of shipping it - not unlike the cost of migrating from Texas to the edge of Maine.

- of course, gasoline costs are also correlated with turnout. In particular, Whites pursuing a k strategy and having future time orientation (conservatives) are likely to spend money, all things equal, to drive to the polls. Discouraged r selectionists (liberals) are less likely.

This does suggest a strategy for the Alt Right to suppress voating by the opposition: enact punitive and regressive gasoline taxes in your state. This gives your state revenues to be independent of the consolidated government (States' Rights) and it discourages migration, and it suppresses r selectionists from voating. Awesome! We can always lower state income taxes if we want to encourage working and keep the tax revenue neutral.

Is gasoline a poll tax? Or is it just that migrants are hard to count in polls? Probably not the latter, because the Romney election is just like the polls would have predicted if he had used Trump's polls instead of his own. No 'undercounted migrants' in 2012.

- finally, gasoline prices certainly correlate with White lower class distress, as also heating fuel costs.

Anyway, the tl;dr

- there was a yuuge oversampling of Democrats giving Trump a 2:1 disadvantage in the polls (not a mere 2% bias), that wasn't in the Romney polls.

- Part of this may be his own efforts at registering supporters; the 'group cohesion' is greater in more White areas, and stronger in the North than the South, maybe. The Midwest is a perfect storm.

- It is probably not a 'shy Hillary supporter' effect because the biggest skew is in states where she has fewer voats by far. Likewise black turnout.

- the remaining bias is ordered north to south. Trump is in trouble in the South and his greatest vulnerabilities are TX, GA, AZ. FL and NC are the tipping point. Fortunately, in this election gasoline prices are high in the right places (and TX red enough not to matter).

- 'Texicity' (distance from Texas) is a factor, and migrant Hispanics (illegals?) responding to gas prices *may* be a factor. If TX tips to Latino voaters, demographically, Democrats will have a strategy to undermine the Alt Right.

- High gasoline prices may be an effective 'Wall' containing Hispanics in places where they are already numerous.

Macrobius Thursday 10 Nov 2016

Discussion with Mike at the Phora:

My mistake, I was actually reading the CA/MA point to TN as the line.

T is fraction of Trump supporters and H is (1-T) as a (crude) estimate. The coefficients are weights to detrend the sloping points.

I read (T.poll, T.election-T.poll [residual])

MA/CA - (-.005,.35)
TN - (0.05,.05)

Here's an online solver

http://math.bd.psu.edu/~jpp4/finitemath/2x2solver.html

A couple copy/paste errors in writeup but the point is any attempt to detrend will give similar coefficients.

This time I got a=0.146, b=-.086 - the ratio of coefficients will be more stable than the magnitude. Somewhere in the 1.7-2.5 range. Bias at the tipping point (T-0.5 = 0) is the other interesting feature of whatever line you pick.
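To make the "ratio is more stable than the magnitude" point concrete, here is a rough sketch that compares the two sets of coefficients quoted in this thread. The helper name `summarize` and the labels are made up for illustration; the first fit uses b = -0.06, which is what the OR-TN system solves to. The bias at the tipping point is taken as the line's value at T = H = 0.5.

```python
def summarize(a, b):
    """Return (weight ratio, bias at the T = 0.5 tipping point) for the line a*T + b*H."""
    ratio = a / abs(b)
    bias_at_half = 0.5 * a + 0.5 * b
    return ratio, bias_at_half


# Coefficients as quoted in the thread for the two eyeballed lines.
fits = {
    "OR to TN": (0.123, -0.06),
    "MA/CA to TN": (0.146, -0.086),
}

summary = {name: summarize(a, b) for name, (a, b) in fits.items()}

for name, (ratio, bias) in summary.items():
    print(f"{name}: ratio = {ratio:.2f}, bias at T=0.5 = {bias:.3f}")
```

Both lines put the ratio between about 1.7 and 2.1 and the tipping-point bias near 3%, even though the individual coefficients differ.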

As I said, I am confused by your calculations, but I don't really see what's so special about the OR-TN line in the first place, which seems an inapt description of the band -- in my eyes, the band is better described as a line starting at roughly the origin (not shown on the graph). What's much more interesting is that the Trump error is positive for all but seven states -- this means forty-three states are showing a Shy Trump tendency. The other seven states are not that mysterious. TX, NM, NV, and CA are packed with Hispanics who presumably cloaked the Shy Trumper effect; HI's Asian population is demographically aberrant so who knows what they were thinking; MA has so few Trumpers that the effect was buried by the poll's overall margin of error. Only WA presents any sort of puzzle, but even that one could be largely explained by the MOE.

There's nothing magical about the band I picked - a linear regression would settle the matter and not involve anyone's 'eye'.
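For what the regression would look like, here is a minimal ordinary-least-squares sketch in plain Python. The function name `ols_line` is made up, and the only inputs are the two eyeballed points already quoted in this thread (T = 0.3 with residual -0.005, and T = 0.6 with residual 0.05); a real fit would pass the per-state poll shares and residuals read off the graph instead. With H = 1 - T, the line a*T + b*H is the same as b + (a - b)*T, so the intercept plays the role of b and slope + intercept plays the role of a.

```python
def ols_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder input: only the two eyeballed points from the thread.
trump_share = [0.3, 0.6]
residual = [-0.005, 0.05]

slope, intercept = ols_line(trump_share, residual)

# Translate back to the (a, b) weights used earlier in the thread.
b_equiv = intercept
a_equiv = slope + intercept

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"a = {a_equiv:.3f}, b = {b_equiv:.3f}")
```

With only two points the fit just reproduces the line through them; the value of the regression comes from feeding it all fifty states, where the slope and its scatter no longer depend on which band the eye picks.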

The butterfly shape in the second graph (heteroskedasticity) is even more interesting. That's because there is no poll data in that graph! Elections count a *lot* more people than polls. If you used Romney 2012 as a predictor (and the polls do no better), you would have to rotate the diagram to get the actual Trump outcome. Meaning Trump did a *real* rotation on Romney space to win.

Two things need 'splainin': (1) reality - why did Trump improve on Romney, (2) polls - why did the polls recapitulate Romney and not catch that fact.

I think I prefer to explain all the data and not just seven points. Probably a linear model is called for, to begin with. What Andrew Gelman and others are saying is that you should shift the whole picture down by 2% and then you will have 25 states to explain not just 7.

I take their point but that does not 'unbias' the residuals -- there's an obvious tilt and butterfly variance (heteroskedasticity), so the model has to be wrong in other (interesting!) ways, not just biased.

Macrobius Friday 11 Nov 2016

Liveblogging the election at Slate... over the course of the evening, the liberals learn their predictions are off (like Romney 2012)

http://www.slate.com/blogs/the_slat...ay_live_blog_results_exit_polls_and_more.html

Macrobius Friday 11 Nov 2016

A Rothschild and a Jew knew

http://andrewgelman.com/2016/11/11/david-rothschild-sharad-goel-called-probabilistically-speaking/

Link to the paper at the link. It was written in early October.

Historically, the MOE is 6-7%. So poll averaging needs to widen its error bars and not quote such outlandish probabilities. Salil Mehta called this one -- and the popular voat maybe.
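A back-of-envelope sketch of why the wider MOE matters. Everything here is an assumption for illustration: the error on the lead is taken to be normal, the MOE is treated as a 95% half-width on the lead (so sigma = MOE / 1.96), the 3-point lead is a hypothetical figure rather than anyone's published number, and `win_probability` is a made-up helper name.

```python
import math


def win_probability(lead_pts, moe_pts):
    """P(true lead > 0) when the polled lead has normal error with 95% half-width moe_pts."""
    sigma = moe_pts / 1.96
    z = lead_pts / sigma
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


lead = 3.0  # hypothetical polled lead, in points

p_narrow = win_probability(lead, 3.0)  # a nominal sampling-only MOE
p_wide = win_probability(lead, 6.5)    # the historical 6-7% MOE

print(f"MOE 3.0: {p_narrow:.2f}   MOE 6.5: {p_wide:.2f}")
```

Under these assumptions the same lead goes from roughly a 97% call to roughly an 82% call once the historical MOE is used, which is the sense in which the quoted probabilities were too confident.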

Gelman has a nice piece on sophistication

http://andrewgelman.com/2016/11/11/election-surprise-three-ways-thinking-probability/

Background: Hillary Clinton was given a 65% or 80% or 90% chance of winning the electoral college. She lost.

Naive view: The poll-based models and the prediction markets said Clinton would win, and she lost. The models are wrong!

Slightly sophisticated view: The predictions were probabilistic. 1-in-3 events happen a third of the time. 1-in-10 events happen a tenth of the time. Polls have nonsampling error. We know this, and the more thoughtful of the poll aggregators included this in their model, which is why they were giving probabilities in the range 65% to 90%, not, say, 98% or 99%.

More sophisticated view: Yes, the probability statements are not invalidated by the occurrence of a low-probability event. But we can learn from these low-probability outcomes. In the polling example, yes an error of 2% is within what one might expect from nonsampling error in national poll aggregates, but the point is that nonsampling error has a reason: it’s not just random. In this case it seems to have arisen from a combination of differential nonresponse, unexpected changes in turnout, and some sloppy modeling choices. It makes sense to try to understand this, not to just say that random things happen and leave it at that.

One might add: Most sophisticated: See Nassim Taleb's masterwork, Silent Risk
