Everybody lies, according to a recent book by ex-Google data scientist Seth Stephens-Davidowitz. He presents wide-ranging evidence on the ways in which our deepest thoughts and feelings, which we often keep hidden, are revealed through our Google searches. In doing so he claims to upend ‘conventional wisdom’ on many sensitive topics such as race, sexual behaviour and our true attitudes to our spouses.
The book outlines how new sources of insight can be captured as technology reaches into ever more intimate aspects of our lives, quietly recording our innermost thoughts (in this case via our Google searches). There is certainly much to recommend in this book, as it offers a compelling account of the value that new data sources can offer.
There is, however, a criticism that needs to be levelled at this book. There is a presumption throughout that because ‘everybody lies’, surveys are pointless. This is a very binary mode of thinking: essentially, if data analytics are good, then self-reporting methods must be bad. He makes it very clear at different points that he is ‘skeptical of survey data’.
Of course, all practitioners need to be able to take criticism on the chin, and the mark of a mature discipline is the ability to identify its boundaries and conditions. There are undoubtedly limitations to surveys as a means of obtaining insight about consumers, particularly on sensitive topics; this is well documented and understood by survey practitioners. A number of techniques are typically used to manage the ways in which survey participants misreport their behaviours on sensitive topics. Michelle Mackie, a Director at Ipsos MORI and a specialist in questionnaire design, suggests the following:
- The use of self-completion methodologies – for example, mail, web, interactive voice response, text messaging – to create a private means of responding
- The use of a ‘forgiving’ introduction – for example, “Some people use erotic or pornographic material often, while others do this rarely or never.”
- Asking the question in a way that presupposes the behaviour – for example, ‘How many cigarettes do you smoke a day?’
- Using the Randomized Response Technique – which aims to eliminate or minimise non-response and dishonesty. The technique advises separating the response from the respondent by introducing a controlled measure of chance or uncertainty, which amounts to randomization of the answering process. This protects the identity of the respondents, at the cost of introducing a degree of uncertainty into the responses.
- Offering reassurances around confidentiality – both at the start of the survey and at the point you ask the questions
- Pre-testing the questions – for example, using cognitive interviewing methods to assess the perceived sensitivity of the questions and to explore strategies that could encourage honesty among participants.
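The Randomized Response Technique in the list above lends itself to a short illustration. The sketch below is a minimal Python simulation of Warner’s classic design (the function names, the chosen prevalence of 0.30 and the coin bias of 0.75 are all illustrative assumptions, not from the book or from Ipsos MORI practice): each respondent privately flips a biased coin and answers either the sensitive question or its negation, so no single ‘yes’ reveals anything about that individual, yet the true prevalence can still be recovered in aggregate.

```python
import random

def warner_rrt_responses(true_prevalence, p_sensitive, n, seed=0):
    """Simulate Warner's randomized response design: each respondent
    secretly flips a biased coin and, with probability p_sensitive,
    answers the sensitive question truthfully; otherwise they answer
    its negation. Returns the observed proportion of 'yes' answers."""
    rng = random.Random(seed)
    yes = 0
    for _ in range(n):
        has_trait = rng.random() < true_prevalence
        asked_sensitive = rng.random() < p_sensitive
        answer = has_trait if asked_sensitive else not has_trait
        yes += answer
    return yes / n

def warner_estimate(obs_yes_rate, p_sensitive):
    """Recover the prevalence estimate from the observed yes-rate:
    lambda = p*pi + (1-p)*(1-pi)  =>  pi = (lambda - (1-p)) / (2p - 1).
    Requires p_sensitive != 0.5."""
    return (obs_yes_rate - (1 - p_sensitive)) / (2 * p_sensitive - 1)

lam = warner_rrt_responses(true_prevalence=0.30, p_sensitive=0.75, n=100_000)
print(round(warner_estimate(lam, 0.75), 2))  # close to the true 0.30
```

The price of the privacy guarantee is variance: the 2p − 1 denominator inflates the sampling error, which is exactly the ‘degree of uncertainty in the responses’ the list above refers to, and why larger samples are needed than for a direct question.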
Therefore, on the accusation that survey participants are not always straight on sensitive topics: guilty as charged. But let’s be clear – the limitations are recognised, steps are taken to mitigate their effects, and ultimately this is a relatively small part of the overall market research industry. It does not invalidate all surveys, most of which are not on sensitive topics.
But the more serious challenge is the way in which the author tends to misrepresent surveys to make his point about the value of data. He discusses the findings of ‘two well-known professors at the University of California’ who pored over survey-based data and reached the conclusion that voters did not care that Barack Obama was black. They had used a survey-based proxy to assess which states were most racist, and then used this to assess the impact of racism on voting behaviour in the 2008 presidential election. They concluded that ‘the data that are available do not suggest that racism played a major role.’ Stephens-Davidowitz challenged their analysis of which states were most racist by examining Google searches for racist references. His work suggested that racism was far more prevalent than the survey-based proxy data indicated, and when modelling outcomes with this new data he found evidence that racism did, in fact, play a more significant role in the outcome of the 2008 presidential election.
This looks, on face value at least, like a justifiable challenge to the original research. But one cannot help suspecting that the authors of the original paper would have welcomed the alternative proxy measures, not least because they are careful to point out that their ‘conclusions are not definitive’. In my experience, researchers welcome a variety of data sources to help get closer to an in-depth understanding of an issue and are not wedded to any particular tool. The wider point is that whilst Stephens-Davidowitz’s challenge was a good one, it is hardly evidence of the widespread failure of survey data.
There is a consistent overstating of cases such as this throughout the book to make his point about the value of data. This is a shame, as it leads to some fairly basic errors in the quality of thinking. For example, he often falls into making ‘category errors’ in his analysis. Take the seemingly innocuous example of his grandmother giving him advice on his choice of girlfriend. She advised him that he needed a ‘nice girl, not too pretty, smart, good with people and a sense of humour’. He interpreted her advice as reflecting a ‘database’ of relationships that she had ‘uploaded’ over nearly a century of her life. He considered that she was spotting patterns and predicting how one variable affects another. In other words, he thought of his grandmother as a data scientist. However, he thought her analysis was dubious, as other data scientists suggested that his grandmother’s view of what makes relationships work was not supported by more representative evidence from an analysis of Facebook data.
But perhaps it was the grandson who was getting it wrong. Just as carrying a hammer makes it easy to treat everything as a nail, being a data scientist makes even your grandmother look like one. Perhaps she was not operating as a quasi-data scientist, applying rules she had developed to determine the optimal outcome; rather, she was thinking about the unique qualities of her grandson and working out what he, as an individual, might need to make him happy. He is making a category error – thinking of her as a data scientist (generating rules from the data) misrepresents her ‘analysis’ and makes it easy to decide it is wrong. But maybe she is not wrong in her n = 1 analysis of one Seth Stephens-Davidowitz. It is a very different type of question from the one he assumes.
In another example, he talks about the ‘conventional wisdom’ that growing up in difficult circumstances helps foster the drive needed to reach the top levels of professional basketball. His support for this conventional wisdom includes an internet survey that he conducted among the US population. But again, surely a range of other means have already been used to reach the conclusion that players are more likely to come from well-off, stable backgrounds. We do not necessarily need an (albeit elegant) analysis of data to tell us this; simple desk research into the backgrounds of players would quickly lead us to the same conclusion. And even if conventional wisdom is wrong, that does not mean surveys are wrong and that we lie. There is a subtle and probably unintentional blurring of being wrong and lying. Surveys are designed to collect an accurate record of our perceptions (among other things). Just because our perceptions are incorrect does not mean that surveys are problematic.
On another point, he compares the words wives use to describe their husbands in social media posts with the words they use alongside ‘husband’ when searching (as shown below):
| Social media posts | Searches |
| --- | --- |
| My best friend | A jerk |
He considers that because we tend to see social media posts rather than Google searches, we may underestimate how many women think their husbands are ‘a jerk’. Again, there is nothing wrong with this, but I am left thinking ‘So what?’ We know that people tend to be more polarised in their opinions on social media. We are also aware that we can hold two apparently contradictory views of an individual at the same time. It is entirely possible that someone’s husband is both their best friend and a jerk. Indeed, is this not simply reflecting the complexity of most relationships? The book keeps trying to make the case that we constantly dissemble and confabulate. The trouble is that, on close inspection, these examples do not quite stack up to support the theme of the book.
Overall, Stephens-Davidowitz understands the need for multiple sources, which includes survey data. He points out that Facebook employs social psychologists, anthropologists and sociologists to find out what the numbers miss. He says that ‘the solution is not always more Big Data’. He talks about the way that ‘if you test enough things, just by random chance, one of them will be statistically significant.’ Indeed, he makes all the sensible points that one might hope for in terms of the limitations of Big Data and the need for the integration of data sources. He implies that surveys are problematic, but he is not really able to make the case and indeed ultimately backs away from that conclusion himself.
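His point about testing enough things is the familiar multiple-comparisons problem, and it is easy to demonstrate. The sketch below (an illustrative Python simulation; the number of tests, the sample size and the seed are arbitrary assumptions, not figures from the book) runs many significance tests on pure noise and counts how many cross the conventional 5% threshold by chance alone.

```python
import random

def count_false_positives(n_tests=200, n=50, alpha_z=1.96, seed=1):
    """Run n_tests two-sample comparisons on pure noise (both groups
    drawn from the same standard normal) and count how many clear the
    z-threshold. With no real effect anywhere, roughly 5% of tests
    will still look 'statistically significant'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tests):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        mean_diff = sum(a) / n - sum(b) / n
        se = (2 / n) ** 0.5  # known unit variance, so a z-test is exact
        if abs(mean_diff / se) > alpha_z:
            hits += 1
    return hits

print(count_false_positives())  # typically around 10 of the 200 tests
```

With no real effect anywhere, around ten ‘discoveries’ appear out of 200 tests, which is why findings mined from large datasets need correction for multiple comparisons, or replication, before they count as insight.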
A sceptic might suggest that a book whose central theme is how wrong we all are, and how problematic surveys are, sells better than a somewhat drier read on the need for integrated research methods. There is plenty to recommend this engaging book, but proceed with caution, as the blurred lines of the argument do not quite stack up.