Varian re your link: cep.lse.ac.uk/pubs/download/dp1480.pdf
The CEP paper on Brexit Leave voters raises a number of flags itself. These are:
"We stress that whilst our paper focuses on the variation of vote shares across local authority areas with respect to key variables such as immigration and education, we have less to say about the overall level of support for Vote Leave. Put differently, our paper focuses on slope coefficients, not intercepts. This is important because in order to get a sense of the absolute number of people who voted for or against Brexit, one would need to refer to data on individuals and how they voted. To some extent, such information is available through polling data, for instance as provided by Ashcroft (2016). Such polls indicate that the typical Leave voter is white, middle class and lives in the South of England. The proportion of Leave voters that are in the lowest two social classes (D and E) is less than one-third (see Dorling, 2016).
We also carry out a back-of-the-envelope calculation regarding turnout. Young voters voted overwhelmingly in favour of Remain but had a lower turnout than older age groups. We find that a higher turnout of young voters would have been very unlikely to result in a different referendum outcome, partly because their turnout was already elevated compared to previous UK-wide elections.
Lastly, we also explore the role of some short-run factors such as heavy rainfall and flooding on the referendum day as well as train cancellations in the South East of England. While we document that these did have a reducing effect on turnout, the reduction does not seem to have affected the overall result: the Remain campaign would still have lost on a sunny day."
In my view, before taking any information from a research paper, one must know exactly what the variables are, which methods of statistical analysis were chosen, and why. What was not included is just as important as what was. In this case I believe the self-imposed parameters of the statistical analysis and the limits of the research are significant, although I am far from being a statistician. They say:
"... key variables such as immigration and education, we have less to say about the overall level of support for Vote Leave. Put differently, our paper focuses on slope coefficients, not intercepts. This is important because in order to get a sense of the absolute number of people who voted for or against Brexit, one would need to refer to data on individuals and how they voted."
The research does not cover the "absolute number of people who voted for or against Brexit". Hence the focus on slope coefficients (how the Leave share varies from one area to another) rather than intercepts (the overall level of support).
This tells you that the key variables they chose are education and immigration; in another analysis, other key variables might have been chosen. So this research looked specifically at those variables rather than others that I, for one, would consider key.
The statistical methods chosen are also significant. They mention this specifically, to alert the reader that these methods are inadequate for analysing how individuals voted; for that one would need to "refer to data on individuals and how they voted", which of course this study did not cover.
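For anyone who wants to see the slope versus intercept distinction in numbers, here is a rough sketch in Python. All the figures are invented purely for illustration and are not taken from the CEP paper; the variable names and the fit_line helper are my own assumptions. It fits a straight line to two made-up sets of area-level Leave shares that respond identically to a qualifications variable but sit at different overall levels.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: percentage of residents without qualifications in five areas.
no_quals = [10.0, 20.0, 30.0, 40.0, 50.0]

# Two invented scenarios for the Leave share (%) in those same areas.
# Both rise by half a point per extra point of no_quals, but the second
# scenario starts 15 points higher everywhere.
leave_low = [30.0 + 0.5 * x for x in no_quals]
leave_high = [45.0 + 0.5 * x for x in no_quals]

slope_low, intercept_low = fit_line(no_quals, leave_low)
slope_high, intercept_high = fit_line(no_quals, leave_high)

print("scenario 1: slope", slope_low, "intercept", intercept_low)
print("scenario 2: slope", slope_high, "intercept", intercept_high)
```

The two scenarios give the same slope but different intercepts. A study that concentrates on slopes would describe them identically, even though the overall level of Leave support differs, which is the limitation the authors flag.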
They also refer to Ashcroft (2016) regarding "some" information on polling data:
"To some extent, such information is available through polling data, for instance as provided by Ashcroft (2016)."
Interestingly, the Ashcroft (2016) polling data does not support the theory, put forward in a different context, that the 'typical' Leave voter is an uneducated, unemployed manual worker on a low wage. Citing Ashcroft, they say:
"Such polls indicate that the typical Leave voter is white, middle class and lives in the South of England. The proportion of Leave voters that are in the lowest two social classes (D and E) is less than one-third (see Dorling, 2016)."
They were also clear that little rigorous analysis had been carried out regarding turnout when they say:
"We also carry out a back-of-the-envelope calculation regarding turnout."
Now whilst the CEP is a highly regarded academic unit attached to an academic body, it is essential that their terms of reference, their chosen variables, and their methods of statistical analysis be considered when reading their findings. They are, as one would expect, open about their purpose and how they arrive at their conclusions, and they indicate where their study did not go or where it is less rigorous. This is entirely proper. What is important is that quoting out of context can misrepresent the findings, so I urge anyone interested to click on the link and read the whole thing for themselves.