Wednesday, October 12, 2016

Calamitous poll should be seen as political entertainment, not a prediction of the future

“Calamitous” was how Owen Jones described the latest poll from ICM, which gives the Conservative Party a 17-point lead over Labour.

This poll followed hot on the heels of a YouGov poll which suggested that the Tories had a 9-point lead over Labour. “We need to do something, and fast” was the message from Owen Jones and members of the Parliamentary Labour Party. Did they over-react?

The two latest polls from ICM and YouGov are examples of why we should treat opinion polls as no more than a bit of political entertainment: perhaps worthy of a headline, but in reality less capable of predicting the future than opinion pollsters would have us believe.
ICM claims a headline figure of 43% for the Conservatives with Labour trailing on 26%, a Conservative lead of 17 points, leading to headlines such as “The Conservatives' lead over Labour has widened to 17% — the second highest ever recorded” (http://uk.businessinsider.com/icm-poll-tory-lead-widens-to-17-over-labour-2016-10).

What this poll suggests is that, when asked, 43% of voters said they would vote Tory and only 26% said Labour. This poll followed a recent YouGov poll which suggested 39% of voters would vote Tory, compared to 30% for Labour. Calamitous indeed! If true. Fortunately, this is far from being the state of affairs at the moment. Labour is trailing with the public, but quite possibly not by the amount being suggested.

In fact, when asked by YouGov, 27% of respondents, not the 39% reported, answered Conservative, with 21% supporting Labour. The same question, put a couple of weeks later by ICM, had 30% answering Conservative, not the 43% which prompted the headlines, and 20% Labour. So we might ask: how do they get these headline figures? The answer, in a word: weighting.
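If you want to see how little magic is involved in the first step, here is a minimal sketch (in Python) of the simplest adjustment: drop the “would not vote” and “don't know” respondents and renormalise the rest. The raw figures are those tabulated later in this post; the further turnout and past-vote weights pollsters apply are not shown.

```python
# A minimal sketch of the first step from raw to headline figures:
# drop the "would not vote" and "don't know" responses and renormalise
# the rest. Real pollsters then apply further turnout and demographic
# weights, which this sketch leaves out.

raw_yougov = {  # raw percentages, YouGov 28-29 Sept (see table below)
    "Conservative": 27, "Labour": 21, "Liberal Democrat": 6,
    "UKIP": 9, "SNP/Plaid Cymru": 4, "Other": 3,
}
expressed = sum(raw_yougov.values())  # 70% expressed a preference

headline = {party: round(100 * pct / expressed)
            for party, pct in raw_yougov.items()}
print(headline)  # Conservative: 39, Labour: 30 -- the reported headline
```

That one step alone takes YouGov's raw 27% Conservative to the reported 39%, and ICM's raw 30% to its 43% headline (30/70 is about 43%); Labour's further drop in the ICM poll, from a renormalised 29% to the published 26%, presumably comes from those additional post-fieldwork adjustments.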

The Independent, with the headline “Conservatives open up 17-point lead over Labour, according to new poll” (http://www.independent.co.uk/news/uk/politics/tory-labour-poll-conservatives-lead-17-points-theresa-may-jeremy-corbyn-icm-latest-a7354091.html), does include the following explanation, from ICM’s Martin Boon, of why it could have been even worse for Labour: “Labour's share has only been saved from a record low by ICM's standard post-fieldwork adjustment techniques, which ordinarily help the Tories.”

Wow, thank goodness for those “standard post-fieldwork adjustment techniques”, otherwise it would have been toast for Labour! The phrase makes this sound very scientific. In fact, whilst these kinds of weighting techniques are used to try to make unrepresentative samples more representative, the 2015 Polling Report commissioned by the polling companies reached the following conclusion: “Our conclusion is that the primary cause of the polling miss in 2015 was unrepresentative samples… The statistical adjustment procedures applied to the raw data did not mitigate this basic problem to any notable degree.”

Historically, pollsters have tended to over-estimate Labour support. This is seen as a problem with their statistical models rather than a failure to get a representative sample in the first place. Given the polling companies' failure to predict the 2015 General Election or the EU Referendum, the statisticians have not simply sat back and allowed their models to continue being wrong. There is no doubt that they have been adjusting and re-adjusting their models.

The problem is that until there is a General Election, which seems highly unlikely until after Brexit, neither they nor we will know how effective their modelling has been. Based on the past, however, it is difficult not to feel that, having plugged one gap in their model, they will simply have opened up another. If they get an election spot-on, that will be more to do with pure chance than the accuracy of their model, which will, with alarming predictability, be proved wrong at the next election. And, given that they know they have been over-counting Labour, the answer surely, from their perspective, would be to adjust downwards for Labour and upwards for the Conservatives?

I happen to be of the opinion that, unless there are absolutely compelling reasons to do otherwise, the results of surveys should be given based on the raw figures, and that the margin of error should be included to give a more accurate picture to those trying to interpret them. We often hear the margin of error reported as plus or minus 3%. Actually, more often we don’t hear this any more, so the headline poll figure is treated as the ‘real’ figure. What the margin of error alerts us to is that unless you ask every single person in a population what they think about x, you have to accept that your sample may not be representative. Assuming that the sample is random, you have a known statistical chance (conventionally 95%) of being within 3 percentage points of the correct answer. So if your sample suggests that 30% are going to vote Conservative, the true figure is probably somewhere between 27% and 33%. The results should be, but never are, reported as a range, any point of which could be correct. Neither, I should add, are most polling samples random.
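For anyone curious where the familiar plus-or-minus 3 comes from, it drops out of the standard formula for the 95% margin of error of a simple random sample. A quick sketch; the sample sizes here are illustrative assumptions (the October polls above don't state theirs, though 1,000-2,000 is typical):

```python
from math import sqrt

# The standard 95% margin of error for a simple random sample:
# z * sqrt(p * (1 - p) / n), with z = 1.96 for 95% confidence.
def margin_of_error(p, n, z=1.96):
    return z * sqrt(p * (1 - p) / n)

# Worst case (p = 0.5) for a typical sample of ~1,000 gives the
# familiar "plus or minus 3":
print(round(100 * margin_of_error(0.50, 1000), 1))  # 3.1 points

# A raw figure of 30% on a (hypothetical) sample of 2,000:
print(round(100 * margin_of_error(0.30, 2000), 1))  # 2.0 points
```

Note that this is the best case: as argued above, non-random samples can miss by considerably more than the formula suggests.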

These are the raw figures (and the margin of error range) for the two recent polls:

Party                 | YouGov 28-29 Sept | Range (+/- 3%) | ICM 7-9 Oct | Range (+/- 3%)
----------------------|-------------------|----------------|-------------|----------------
Conservative          | 27                | 24-30          | 30          | 27-33
Labour                | 21                | 18-24          | 20          | 17-23
Liberal Democrat      | 6                 | 3-9            | 6           | 3-9
UKIP                  | 9                 | 6-12           | 9           | 6-12
SNP/Plaid Cymru       | 4                 | 1-7            | 4           | 1-7
Other                 | 3                 | 0-6            | 1           | 0-4
Would not vote        | 13                | 10-16          | 4           | 1-7
Don’t know/Undecided  | 18                | 15-21          | 22          | 19-25

There is a lot going on here, but what is worthy of note are the “don’t knows” and “would not votes”. We know, because we have the data, that turnout at every General Election since 2001 has been below 70%, well down on the 78% recorded in 1992; in 2015 only 66% of registered voters actually voted. However, one of the weighting techniques is to take those who say they don’t know and assign them to a party based on the way they voted in 2015. But only 11% of the sample said they did not vote in 2015, and another 4% would not or could not say how they voted. If those figures were true, turnout in 2015 would have been 85% (the remaining 100 - 11 - 4 = 85% all claimed to have voted). In other words, the weighting is based either on an over-estimation by those who say they voted (but did not) or on a sample which is disproportionately made up of people who do vote. If the latter, then the rationale behind the weighting is actually quite suspect. This is compounded somewhat by the mere 4% who say they would not vote in an election, way lower than the actual proportion of the electorate who do not vote in practice.
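To illustrate the technique just described, here is a toy sketch of reallocating the “don’t knows” by recalled 2015 vote. The 22% figure is ICM's, from the table above; the recalled-vote split is entirely hypothetical, not ICM's actual model:

```python
# A toy sketch of the reallocation described above: "don't knows" are
# assigned to parties according to how they say they voted in 2015.
# The 22% comes from the ICM raw figures in the table; the recalled-
# vote split is purely hypothetical, not ICM's actual model.

dont_know = 22.0  # % of the ICM sample answering "don't know"

recalled_2015 = {  # hypothetical 2015 recall among the don't-knows
    "Conservative": 0.40, "Labour": 0.35, "Other": 0.25,
}

reallocated = {party: round(dont_know * share, 1)
               for party, share in recalled_2015.items()}
print(reallocated)  # {'Conservative': 8.8, 'Labour': 7.7, 'Other': 5.5}

# The catch argued above: this trusts the sample's 2015 recall, yet
# that same recall implies an 85% turnout against an actual 66%.
```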

There are some complex statistical procedures at work in opinion polling, and I am not suggesting that opinion pollsters are deliberately setting out to mislead the public. What I am suggesting is that, like economic forecasters, they cannot control every variable. They therefore use complex models to try to predict as best they can. That these models are occasionally right does not make them anywhere near foolproof. Indeed, the fact that they are rarely correct two elections running is pretty sure proof that they are anything but.


Opinion polls should not be treated as facts. They are, at best, reasonable guesses at the way an election will turn out. At worst, they are complete misrepresentations, more likely to be used to sway public opinion than to reflect it. I am not arguing that opinion polls, per se, are of no use, but that the headline figures, whilst entertaining, are not a true indication of the state of play of the parties at any given time; rather, they are a set of figures based on some suspect, and constantly evolving, statistical trickery. Now, no doubt, were a poll to be published tomorrow that gave Labour a 10% lead, I would be as pleased as any other Labour supporter. But would I believe it? It is often said, usually by those apparently doing badly in the polls, that the only polls that really matter are those which decide an election. I wouldn’t go that far, but given the apparent inability of pollsters to predict events with any accuracy, we might think that they would stop pretending that their polls are anything more than snapshots; in this case, snapshots which have been heavily photoshopped before being released to the public.

Tuesday, August 16, 2016

Why we should not care too much about polls

I started thinking about writing this blog prompted by Owen Jones's suggestion that Jeremy Corbyn supporters do not seem to care enough that Labour is not doing well in the polls. It is certainly true that some of the most recent poll results have not made particularly good reading for Labour supporters. However, the fetishisation of opinion polls is a dangerous game, and as somebody with over 20 years' experience of teaching research methods I thought I should perhaps share some of my knowledge more widely.
We need to be clear about what an opinion poll can and cannot do. We also need to be clear about what an opinion poll is actually telling us. Owen Jones is an astute and intelligent journalist, and should know never to take anything at face value. I have often made the point to my students that we tend to be more critical of polls with which we disagree than of those that confirm what we already believe. I see this on social media, where both sides of the leadership debate quote polls approvingly only if they seem to show that they are right.
It is important to recognise that polls, particularly political polls, are not simply random acts of data, but are conducted to support or deny particular narratives. Poll figures are always a range, for reasons I shall explain. Yet when polls are reported in the press and through the media we are always given just the headline figure. A rather typical recent example was the Daily Mirror's headline “Labour sinks 16 points behind in grim new poll”. This was from July 27th, and the headline figures were that in this poll of 2,012 people, conducted online and asked "If there were to be a general election tomorrow which party do you think you would vote for? Conservative/Labour/Liberal Democrat/Other?", 43% said Conservative and 27% Labour. The implication is clear: this is bad news for Labour, and particularly for Jeremy Corbyn, who is clearly to blame for this state of affairs.
The poll was conducted by ICM, who have been tending to show a greater lead for the Tories, and who, you probably do not recall, had Labour ahead consistently in the run-up to the 2015 General Election and predicted a 35-35 draw for the election itself. In other words, they expected, and convinced most of us, that we were heading for a hung parliament. More importantly for my argument, the under-reporting for the Tories was around 2%, the over-reporting for Labour around 5%. It is important to note that other polls were similar, so the actual vote was anywhere from 2-5% different from the polls. Hold that thought.
The problems that polling companies have to overcome are incredibly complex. It starts with the question asked. Respondents are asked to speculate on how they would vote in an event they know is not imminent. Most people will not admit that they have no intention of voting, so within the results are a number of respondents (and we can never know how many) who will not vote. We also know that many people do not decide how to vote until very close to the event. In other words, in asking people to think about the General Election we are, currently, asking them to consider an event that could still be four years away. Adding the "if it were tomorrow" qualifier is rather like saying "if you suddenly became 20 feet tall, would it make any difference to you?"
The problem of interpreting the question is compounded by the problem of sampling. Although, statistically speaking, 2,000 people are likely to be representative of the population at large, they are still 2,000 people who are prepared to be asked. Where particular groups, especially the young and ethnic minorities, are under-represented, the pollsters use weighting to make the results representative. So whilst 18-24 year olds may be around 9% of the population, the sample may only manage to find, let's say, 50 young voters who will talk to them. To make up for this, the weighting will mean that each of them is, effectively, counted around four times. That might not matter if the views of young people were homogenous, but it could seriously bias the results. The more important point is that polls, by definition, do not include the views of those who refuse to be polled (but may vote), and that is problematic. The tendency of polling organisations is to treat non-responders as if they are the same as responders. In the absence of any data, what else can they do? But non-responders are likely to be disproportionately young or, where online polls are concerned, disproportionately old.
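To put numbers on that, here is a minimal sketch of the weighting arithmetic, using the hypothetical figures above (9% of the population, 50 respondents in a sample of 2,000):

```python
# A minimal sketch of demographic weighting, using the hypothetical
# figures above: 18-24s are ~9% of the population but only 50 of a
# 2,000-person sample. Each young respondent's answer is scaled up so
# the group counts for its true population share.

population_share = 0.09   # 18-24s' approximate share of the electorate
sample_size = 2000
young_respondents = 50

sample_share = young_respondents / sample_size   # 0.025, i.e. 2.5%
weight = population_share / sample_share         # 3.6
print(f"Each 18-24 respondent is counted {weight:.1f} times")

# The risk described above: 50 people now stand in for ~180, so any
# quirk in who those 50 are gets multiplied 3.6-fold.
```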
But does this mean that we should ignore the polls? All polls are subject to error: not simply from asking speculative questions of the wrong people, but from the very real statistical margin of error. For most polls this is estimated at +/- 3 per cent. That is to say that if, for example, the Daily Mirror says that Labour has 27% of the vote, the real figure will probably be somewhere between 24% and 30%. In fact, the margin of error of +/- 3% is a conventional figure based on a random sample, but most polls, particularly online ones, are far from random, and the true margin of error could be much greater. So the fact that pollsters could be up to 5% out on a general election (the only time opinion polls are actually tested, incidentally) should not surprise us.
But even accepting that the margin is +/- 3% puts a different perspective on the polls. Let's be clear here: the Tories have been ahead of Labour for some time in most polls. Occasionally an individual poll will put Labour ahead, as happened with YouGov just before the EU Referendum, but looking overall at recent polls of voting intentions, most have the Tories ahead. What is less clear is just how far ahead, and how much that should worry those of us who would rather like to see the Tories out of power.
Let's see what the polls are telling us. To do this I will use the excellent UK Polling Report, which provides data on all the main polling organisations. If we go back to May, during the Referendum campaign, most polls put Labour somewhere between 2-6% behind the Tories. However, if we apply the +/- 3% margin, that changes quite dramatically. If Labour was doing 3% worse and the Tories 3% better than the published figures, the Tory lead could have been anywhere between 8% and 12%. On the other hand, in the equally plausible possibility that Labour was 3% better than reported and the Tories 3% worse, the two parties could have been level, or Labour could even have been ahead by up to 4%. Which is correct? The simple answer is we don't know.
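The arithmetic behind those ranges is simple but worth spelling out: if each party's share can be off by 3 points, the gap between them can be off by up to 6. A quick sketch of the calculation above:

```python
# How a +/- 3 point margin on each party's share translates into the
# ranges quoted above: the lead between two parties can swing by up to
# 2 * 3 = 6 points in either direction.

moe = 3
for published_lead in (2, 6):    # the May Tory leads discussed above
    tory_best   = published_lead + 2 * moe   # Labour -3, Tories +3
    labour_best = published_lead - 2 * moe   # Labour +3, Tories -3
    print(f"Published lead {published_lead}: "
          f"true gap between {labour_best:+d} and {tory_best:+d}")

# Published lead 2: true gap between -4 and +8
# Published lead 6: true gap between +0 and +12
# (negative = Labour ahead)
```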
The media tend to go for the result which backs their particular narrative. In this case, and increasingly since June, that narrative has been that Jeremy Corbyn is unelectable. That narrative has been aided by those within his own party arguing it to be the case: leadership contender Owen Smith argued precisely this during the leadership hustings (see "Owen Smith says Jeremy Corbyn's principles are 'just hot air'").
Clearly, poll ratings which seem to show Labour support ebbing away will be popular with a media which has decided that Jeremy Corbyn should not be allowed to lead the Labour Party. The reporting of polls should include all the relevant information. For example, in June, in the run-up to the EU Referendum, Labour was gaining on the Tories. It is worth noting that in the 8 polls published in June the Tory lead ranged from 0-5%. But given the margin of error, Labour could well have been in front by up to 6 percentage points. This would have given a rather different complexion to Labour MPs' discontent with Jeremy Corbyn. Unfortunately, many of them are as innumerate as the general population (see "What happened when MPs took a maths test").
So, in reply to Owen Jones, and possibly Owen Smith too: should we be worried by Labour's performance in the polls? Well, yes, it is not good to be behind, but we should bear in mind that polls are part of a wider narrative. If 8 polls in the space of 4 weeks can be widely divergent, and if the media refuse to explain that poll results are a range rather than an absolute score, we should treat the polls as what they are: pieces of political data that are fun to watch but not worth spending too much time obsessing over.