Fake Polls Are A Real Problem

Is Kid Rock leading the U.S. Senate race in Michigan? A story like that is essentially designed to go viral, and that’s exactly what happened when Delphi Analytica released a poll fielded from July 14 to July 18. Republican Kid Rock earned 30 percent to Sen. Debbie Stabenow’s 26 percent. A sitting U.S. senator was losing to a man who sang the lyric, “If I was president of the good ol’ USA, you know I’d turn our churches into strip clubs and watch the whole world pray.”

The result was so amazing that the poll was quickly spread around the political sections of the internet. Websites like Daily Caller, Political Wire and Twitchy all wrote about it. Texas Gov. Greg Abbott tweeted it out. And finally, Kid Rock himself shared an article from Gateway Pundit about the poll.

There was just one problem: Nobody knew if the poll was real. Delphi Analytica’s website came online July 6, barely a week before the Kid Rock poll was supposedly conducted. The pollster had basically no footprint on the web.

Indeed, Delphi Analytica isn’t a polling firm in any traditional sense, and it’s not entirely clear they even conducted the poll as advertised.

The story of Delphi Analytica, its mysterious origins and its Kid Rock poll shows that the line between legitimate and illegitimate pollsters is blurring. Much of the polling industry is moving online, where conducting a survey is far less expensive than making thousands of phone calls. But that lower price has also opened up polling to all sorts of new people. Some are seasoned professionals trying an old craft with a new tool, or well-informed, well-meaning amateurs trying to break into the industry. Others have less noble goals: They’re pranksters seeking attention and scam artists trying to make a quick buck.

If you’re a political observer interested in polls or a journalist who writes about them, you need to be more careful than ever.

Who runs Delphi Analytica?

Whenever a new poll comes out, the first question anyone should ask about it is: Who conducted this survey? Do the people behind it have experience conducting polls? At a minimum, can you find reliable information about their backgrounds?

Delphi Analytica’s website says only that “Delphi Analytica was founded in 2017 by a group of individuals from diverse political backgrounds, united by their affinity for politics, who wanted to create a grassroots public polling organization.”

This should set off immediate alarm bells. Transparency is one of the central tenets of the polling industry. But no individuals are listed as employees on the Delphi Analytica website. Additionally, the site is registered to Domains By Proxy, LLC, a service that keeps the names of a website’s registrant or registrants hidden.

When I wrote to the contact email address listed on Delphi Analytica’s site, initially a person using the name Jessica Lee responded. (A Google search for “Jessica Lee” and “Delphi Analytica” revealed no hits.) She wouldn’t provide the name of Delphi Analytica’s owner, the names of any of the other people working there, or even her own title. She claimed the owner lived in Ohio and the rest of the team worked in the New York metropolitan area. Since FiveThirtyEight is based in New York City, I asked twice if I could meet with someone located in or near the city. Lee refused the first request and ignored the second.

Over the weekend of Aug. 19, I contacted the Delphi Analytica email account one last time, asking again who worked for the group. I got a response that refused again to provide names and was once again signed Jessica Lee, but this email was sent from a Hotmail account that had the name Stephen Lewis in the handle. Lewis would not provide his title at Delphi Analytica — he claimed that people who work for the site don’t have titles — and said the statement was from Lee, who “usually handles our phone and email inquiries.” Lewis said he helps analyze the polling data. My last email to the Delphi Analytica email account, sent the morning of Aug. 20, was returned to me with a “host unknown” message.

So what’s going on here? In short, I don’t know. We do have some hints, however.

After Delphi Analytica released its Michigan survey (it has released eight polls in all), I received a direct message on Twitter from Michael McDonald, a source I had spoken to before. McDonald follows political betting markets and had previously contacted me about another survey firm, CSP Polling, that he believed was a shell organization started by some people who use PredictIt, a betting market for political propositions. McDonald said that CSP stood for “Cuck Shed Polling.” Like Delphi Analytica, CSP Polling doesn’t list anyone who works there on its website.

McDonald, who most recently worked as the social media manager for The Moving Picture Institute’s “We The Internet” YouTube series and has no official connection to any polling organizations or betting markets, told me that after the 2016 election, some PredictIt users started gathering in a chat room on Discord, a voice and text application often used by gamers, to talk politics and betting. McDonald shared screenshots from that chat room, where a person going by the screen name “Autismo Jones,” who claimed to have started Delphi Analytica, bragged about the publicity the Kid Rock poll was receiving. Jones, apparently reacting to an email I had sent to Delphi, wrote, “we dont need Harry Enten. we got governors tweeting out our polls. we are already famous.” Jones even claimed to have closed the comment section on Delphi Analytica’s post sharing the Kid Rock poll because people were saying the poll wasn’t real. The comment section did, in fact, close, but it has since reopened.

McDonald believes that “Jones” and whoever may have helped him or her did so for two reasons. The first: to gain notoriety and troll the press and political observers. (The message above seems to support that theory.) The second: to move the betting markets. That is, a person can put out a poll and get people to place bets in response to it — in this case, some people may have bet on a Kid Rock win — and the poll’s creators can short that position (bet that the value of the position will go down). In a statement, Lee said Delphi Analytica was not created to move the markets. Still, shares of the stock for Michigan’s 2018 Senate race saw their biggest action of the year by far the day after Delphi Analytica published its survey.

The price for one share — which is equivalent to a bet that Stabenow will be re-elected — fell from 78 cents to as low as 63 cents before finishing the day at 70 cents. (The value of a share on PredictIt is capped at $1.) McDonald argued that the market motivations were likely secondary to the trolling factor, but the mere fact that the markets can be so easily manipulated is worrisome.
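To put those numbers in context, a share capped at $1 can be read, very roughly, as the market’s implied probability of the outcome. The following is a minimal sketch in Python that uses only the three prices quoted above. The function and variable names are illustrative, and treating a price as a probability ignores fees and thin trading, so this is an approximation rather than how PredictIt itself reports anything.

```python
def implied_probability(price_cents, cap_cents=100):
    """Read a share price as a rough implied probability of the outcome."""
    return price_cents / cap_cents


# Prices in cents for a share paying out if Stabenow is re-elected,
# as quoted in the text above.
start_price, low_price, close_price = 78, 63, 70

drop_to_low = start_price - low_price        # points lost at the day's low
drop_at_close = start_price - close_price    # points lost by the close
relative_drop_to_low = drop_to_low / start_price

print(f"Implied chance at start: {implied_probability(start_price):.0%}")
print(f"Implied chance at low:   {implied_probability(low_price):.0%}")
print(f"Drop to low: {drop_to_low} cents ({relative_drop_to_low:.1%} of the starting price)")
print(f"Drop at close: {drop_at_close} cents")
```

Read this way, a single unverified poll briefly knocked about 15 points off the market’s estimate of Stabenow’s chances.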

I wasn’t able to corroborate McDonald’s story with any sources willing to talk on the record. I can’t be sure if Delphi Analytica was, in fact, set up for laughs or for money. That said, many of the details of McDonald’s story line up with other evidence. For instance, many of the most recent images appearing on the Delphi Analytica website are hosted on the Discord server, which is consistent with McDonald’s story that the people behind Delphi Analytica use the Discord app. Additionally, Stephen Lewis invited me to “join our [D]iscord” if I was “really interested in speaking with” the Delphi Analytica team.

Furthermore, McDonald said that the person going by Autismo Jones on Discord also used the handle @James_st1 on Twitter. In direct messages on Twitter, @James_st1 would not provide his real name and claimed he was not the primary person behind Delphi Analytica, but also said that he helped secure funding for the organization.¹ @James_st1 is the only account following the Delphi Analytica page on Medium.

Whoever is behind Delphi Analytica, the group’s lack of transparency about who they are is its own indictment; it’s hard to trust information when you don’t know who’s providing it, or what their motives or incentives are.²

Was the poll real?

It remains unclear whether the person or persons behind Delphi Analytica conducted a poll. McDonald claims that the people behind the agency merely put a poll from Google Surveys into the field and may not have adjusted the data to ensure a representative sample. When I asked Delphi Analytica directly about their polling process, an email signed by Jessica Lee said that the group used both Google Surveys and SurveyMonkey and adjusted the data, though that was not noted in the original writeup.³ When asked for specifics about the Michigan poll, Lee did not respond. Neither Google Surveys nor SurveyMonkey could confirm whether this survey was done on their platforms.

Still, Delphi Analytica’s polls and the data on its site are somewhat consistent with what you would expect to see if the polling was done on the Google Surveys platform. Google Surveys includes details of each respondent’s age bracket, gender, and city and state in any given poll. (You can see an example of the kind of output these polls produce if you download the data for this poll that asked about the Montana special election for its at-large House district earlier this year.) The raw data for the Delphi Analytica poll of the Michigan Senate race includes age brackets, gender, and city and state information for its respondents. The age brackets⁴ match Google Surveys’ age brackets. The way the geographic data is formatted is also similar.⁵

Also telling is the relatively frequent appearance in both polls of the answer “unknown” for both age and gender, and the number of respondents whose location is just listed as a state, with no city information. Google Surveys infers a user’s age, gender and region (with fairly accurate results) “based on the websites users visit and … [their] IP addresses.”⁶ Sometimes, though, these demographics cannot be inferred, which leads to some fields being marked as “unknown” or containing incomplete information. (Most pollsters ask survey-takers for their age and gender rather than inferring that information through other means.)

But some of the raw data in Delphi Analytica’s Kid Rock poll is, at the very least, strange. Kid Rock leads Stabenow 26 percent to 18 percent among the 72 respondents in Detroit, for instance. While the margin of error on this subsample is large and it’s certainly possible that Kid Rock’s unique candidacy would upset the normal electoral map, there just aren’t very many Republicans in Detroit — Hillary Clinton beat President Trump there 95 percent to 3 percent.
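To get a sense of how large the margin of error on a 72-person subsample is, here is a minimal sketch in Python using the textbook formula for a simple random sample at 95 percent confidence. That assumption is generous for a weighted online sample, and the function name is illustrative, not something Delphi Analytica or Google Surveys publishes.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for one proportion under simple random sampling.

    p=0.5 gives the worst case; z=1.96 corresponds to 95 percent confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)


# 72 is the number of Detroit respondents cited in the text above.
moe_detroit = margin_of_error(72)
print(f"+/- {moe_detroit * 100:.1f} points")
```

Under those assumptions, each candidate’s share in Detroit could be off by more than 11 points in either direction, so the eight-point gap is within the noise. What stands out is less the lead than a Republican drawing that level of support in Detroit at all.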

A look at the raw data released by Delphi Analytica also reveals a formatting inconsistency among those answering “I prefer not to answer.” With the way the data is sorted when it’s downloaded from the Delphi Analytica site, the responses appearing in the first half of the spreadsheet capitalize the “I”; for responses in the second half, the “I” is lowercase (“i prefer not to answer”). That shouldn’t happen if the results of one poll are automatically imported into a spreadsheet, as they typically are when a poll is conducted using Google Surveys. This suggests that either the results were changed⁷ or multiple surveys were combined. Either way, it’s the type of artifact you tend to see in polls that aren’t on the level.⁸
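A check like this is easy to automate. The sketch below, in Python, groups answers that differ only in capitalization and flags any answer that shows up in more than one form. The sample rows are made up for illustration and are not Delphi Analytica’s actual data, and the function name is hypothetical.

```python
from collections import Counter, defaultdict


def find_case_variants(answers):
    """Return answers that appear with more than one capitalization."""
    variants = defaultdict(Counter)
    for answer in answers:
        cleaned = answer.strip()
        variants[cleaned.lower()][cleaned] += 1
    # Keep only answers written in at least two different ways.
    return {key: dict(forms) for key, forms in variants.items() if len(forms) > 1}


# Invented rows standing in for one column of a poll's raw data export.
sample_answers = [
    "Debbie Stabenow",
    "I prefer not to answer",
    "Kid Rock",
    "i prefer not to answer",
    "Kid Rock",
]

flagged = find_case_variants(sample_answers)
print(flagged)
```

A flagged answer does not prove tampering on its own, but it is a prompt to ask the pollster how the file was assembled.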

On the other hand, CSP Polling, which appears to have ties to the same Discord group that McDonald identified as hosting members of Delphi Analytica, may not even have done a Google Survey. McDonald claims CSP Polling was just an outright fake. CSP Polling did not respond to an email request for comment. Like Delphi Analytica, CSP Polling seemingly popped up overnight. Its Twitter account was created on the evening of May 25 — just before the polls closed for the Montana special election, which the group claimed to have surveyed. (The first post on the website wasn’t published until a week later.) CSP also supposedly conducted polls in June on the Paris Accords, the Virginia gubernatorial Democratic primary and the special congressional election in Georgia’s 6th district. Unlike Delphi Analytica, CSP Polling did not release raw data for any of these polls. The methodology information for all three polls says they were conducted online, but these writeups don’t give any details as to how this online sample was obtained.

As with Delphi Analytica, we found some Twitter accounts that appear to be associated with CSP Polling, but very little other evidence of who is running the site. McDonald said that CSP Polling was run by “Lance Stewart” (@topsznregime91) and “Kdawg” (@Kennnnnny). In a Twitter direct message, Stewart — who said that wasn’t his real name — claimed to have run CSP Polling’s social media for just “2-3 days and had no part in conducting the polling.” He said Kdawg wrote the press releases; that user posted at least two tweets regarding CSP Polling, but they have since been deleted.

Trusting polls is about more than real vs. fake

It’s fairly easy to dismiss both CSP Polling and Delphi Analytica. Maybe they conducted some polling and maybe they didn’t, but their lack of transparency and shady behavior make it easy to disqualify their work. But let’s say they were more transparent, and their data looked more legitimate. Even then it’s not clear whether news outlets should take their results seriously.

FiveThirtyEight has traditionally accepted any poll from any firm so long as we don’t have evidence the poll or pollster is an outright fake. But that’s in large part because, as of a few years ago, the logistics of fielding a poll were daunting enough that not just anybody could do it. It costs thousands of dollars to put a phone survey in the field, and until recently, phone surveys were pretty much the only game in town. To conduct them, you either needed to use call centers or somewhat cheaper automatic voice polling (aka “robopolls”).

Now putting out a “poll” is easy and relatively cheap. It costs just $60, for example, to ask one question to 400 respondents in New York state via Google Surveys.⁹ That opens the door for many people to field surveys who wouldn’t have been able to afford it in years past. This can certainly be a good thing: More pollsters means more data and more innovation. But that also means that we now have to ask ourselves not just whether a survey is real or fake, but also whether the person designing the survey knows what they are doing. You can think of it as a two-dimensional plane, with one axis running from fake to real and the other from amateur to professional.

If a pollster is fake and unknown, then we should certainly ignore it. If a pollster isn’t actually fielding polls, we should ignore the results even if the pollster is professional (e.g., Research 2000, which settled a lawsuit accusing the firm of fabricating polling data). On the other hand, if the pollster is real and respected (e.g., Marist College or Monmouth University), then we obviously want to take its results seriously.

But it’s less clear how we should treat amateurs who are using platforms such as Google Surveys or SurveyMonkey Audience¹⁰ to field their political polls. These platforms provide users with the tools they need to conduct accurate political polls, but accuracy is not guaranteed if they don’t know what they’re doing.

Accurate political polling mainly comes down to ensuring that the voting population is properly represented in the poll. To get the right representation, pollsters typically weight survey respondents according to key variables such as age, education, gender, race, region within a state (for state polls¹¹) and registered or likely voter status. So, for example, if the sample of respondents your poll reaches has too few women relative to the electorate you’re trying to measure, you would count the answers of each woman you did reach a little extra. It’s extremely tricky to accurately weight a polling sample — even the pros disagree about how best to do it, and they sometimes get it wrong. Improper weighting was at least partially responsible for the polls underestimating President Trump’s strength in states with large populations of white voters without a college degree.
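The “count each woman a little extra” idea can be sketched in a few lines of Python. This example weights on a single variable, gender, which is far simpler than what professional pollsters do. The respondents, the target shares and the function names are all invented for illustration.

```python
from collections import Counter


def poststratification_weights(sample_groups, target_shares):
    """Weight each group by (target share) / (share in the sample)."""
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return {group: target_shares[group] / (counts[group] / n) for group in counts}


def weighted_share(responses, weights, candidate):
    """Share of weighted respondents who picked the given candidate."""
    total = sum(weights[group] for group, _ in responses)
    support = sum(weights[group] for group, choice in responses if choice == candidate)
    return support / total


# Invented sample: 4 women and 6 men, each paired with a candidate choice.
responses = (
    [("woman", "Candidate A")] * 3
    + [("woman", "Candidate B")] * 1
    + [("man", "Candidate A")] * 2
    + [("man", "Candidate B")] * 4
)

# Hypothetical electorate: 52 percent women, 48 percent men.
target_shares = {"woman": 0.52, "man": 0.48}

weights = poststratification_weights([group for group, _ in responses], target_shares)
unweighted = sum(choice == "Candidate A" for _, choice in responses) / len(responses)
weighted = weighted_share(responses, weights, "Candidate A")

print(weights)
print(f"Unweighted: {unweighted:.0%}, weighted: {weighted:.0%}")
```

In this toy sample, women are underrepresented, so each woman’s answer counts 1.3 times and each man’s 0.8 times, which moves Candidate A from 50 percent to 55 percent. The hard part in practice is choosing the targets: weighting to the general population instead of the likely electorate, or leaving out a variable like education, can push the result in the wrong direction.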

But the default for both the Google Surveys and SurveyMonkey Audience platforms is to provide data only on age and gender (along with city, for Google Surveys), which are weighted to match the general population, not the voting population. If you’re conducting a poll, you can ask respondents about their race and education, but even when you have those variables, they must be weighted correctly to match the voting population or they won’t give an accurate picture of the electorate.¹² And for that, the pollster is on their own.

Google Surveys and SurveyMonkey ultimately have no real control over how people who use their platforms report or weight the results. People can conduct polls on these platforms, reweight the results any way they want, fail to give credit to Google Surveys or SurveyMonkey, and still rightfully claim that a real poll was conducted.

To be clear, I’m not saying that political surveys conducted on Google Surveys or SurveyMonkey Audience are inherently flawed. Far from it. Both companies have published professional, legitimate polls of their own, and third parties have conducted high-quality polls on both platforms.¹³ Red Oak Strategies did very good work in the 2016 presidential election using the Google Surveys platform. NBC News has used SurveyMonkey Audience for some of its political surveys.¹⁴ Moreover, traditional surveys are far from perfect and can wind up with an unrepresentative sample as a result of problems like non-response bias,¹⁵ which no amount of weighting can correct.

Indeed, the answer to pollsters like Delphi Analytica isn’t for the media to just unthinkingly winnow down the number of pollsters they report on. If they do that, they’ll miss pollsters who correctly gauge the electorate even when many traditional pollsters don’t. One example is Robert Cahaly, who runs the Trafalgar Group and was one of the most accurate pollsters in 2016; he was one of the few who correctly anticipated Trump’s victory. He says he’s had some difficulty getting accepted by some members of the media because he isn’t as well-known as more traditional pollsters. He told me that he completely agreed that the existence of fake firms makes it easier for skeptical members of the media to dismiss all pollsters they are unfamiliar with. Ignoring unknown pollsters like Cahaly could deprive readers of information that big-name pollsters are missing. Instead, what the media needs to do is examine each poll and pollster individually and make a judgment call.

A fake poll can have real influence

For most people, no individual poll will have much of an impact. Most voters go about their lives not paying attention to every single poll that is or isn’t reported by the press. But make no mistake: A rogue poll or pollster can influence an election.

As Adam Geller, a Republican pollster who worked on the Trump campaign, told me, public polls can create news because “they are easy stories to write.” But, he said, “there is far too little scrutiny on the methodology of the poll. To most journalists, a poll is a poll is a poll.”

Public polls can also influence donors, Geller says. Donors don’t want to back a likely loser. Voters themselves can be influenced as well. For example, in a primary campaign where voters are trying to decide between ideologically similar candidates in a large field, voters may take into account who they think has the best chance of winning. A fake poll could affect that calculus.

That influence, of course, is predicated on the poll getting attention. But it’s not difficult to attract press coverage. As Steve Berman, a writer at the conservative website The Resurgent who wrote a skeptical take on the poll, told me, “Anyone in the entire political blogosphere is maybe 1 or 2 degrees of separation from anyone else. The trust network is fairly strong.” In other words, once a story appears in one place, it’s likely to appear in many others because people believe a reputable website wouldn’t get fooled by a fake poll. The good news is that some of the initial reports on the Delphi Analytica poll were taken down or amended once word spread that something didn’t smell right about it. The bad news, as Berman mentioned to me, is that there’s no real way to stop the initial spread of these shoddy polls — we can only try to limit their impact.

In this case, Delphi Analytica’s claims may have made Kid Rock more seriously consider entering the Michigan Senate race. He retweeted the results, after all. And while the singer has not made any official moves toward running for Senate, such as filing a statement of candidacy, it wasn’t too long after Delphi Analytica published its poll that Kid Rock said he’d take a “hard look” at a Senate bid and that former New York Gov. George Pataki endorsed him.

Think about that for a second. A poll that may not even have been conducted could wind up being at least partially responsible for the election of a musician to the U.S. Senate. It’s pretty amazing.

Footnotes

¹ @James_st1 also tweeted at me on July 28 that Delphi Analytica’s Michigan Senate poll was “real,” though that tweet has since been deleted.
² It should be noted that two legitimate polling organizations also found a close race for Stabenow’s Senate seat. Trafalgar Group, in a sample that had far fewer undecideds than appeared in Delphi Analytica’s poll, found Kid Rock ahead 49 percent to 46 percent. Target-Insyght gave Stabenow a 50 percent to 42 percent advantage.
³ Lee didn’t say whether the company used SurveyMonkey’s Audience panel (in which a customer pays to receive responses from a cross-section of Americans) or whether Delphi Analytica sent out a link to a poll created on the SurveyMonkey platform, which wouldn’t guarantee a representative sample.
⁴ 18-24, 25-34, 35-44, 45-54, 55-64 and 65+.
⁵ Both sets of data use two-letter state abbreviations, followed by a hyphen and then the name of the city. Interestingly, the column header for the horse race question in both the Delphi Analytica and Montana surveys (“Question #1 Answer”) is also identical.
⁶ Note that someone conducting a poll could still ask about demographic information in the poll itself if they wanted to ensure that Google’s inferred information is accurate.
⁷ If the goal were to manipulate markets, it could be useful to both conduct an actual poll so you knew which way the race was really going, and to publish a doctored version of that poll to trick other bettors.
⁸ Keep in mind too that Delphi Analytica did not release raw data for a number of their polls, including their two most recent surveys, and we spotted some inconsistencies in the data the group has released from its national surveys. For example, sometimes the data includes each respondent’s state (information we would expect to see in the data from a Google Surveys poll), but the data from the group’s poll on transgender Americans in the military included only respondents’ cities, not states. In other words, either that poll was not done using Google Surveys or the column that lists states was deleted for unclear reasons. It’s possible that there’s an innocent explanation for this inconsistency, but it’s just one of a number of suspicious data points.
⁹ Google Surveys often appear on local news websites (you may have seen them if you’ve ever been asked to answer a question so you can read the rest of an article), and Google pays the publication money for every completed survey. Access to the site’s content serves as the incentive to get users to take the poll. Meanwhile, SurveyMonkey’s Audience panel, which would be the proper way to field a political survey of the general public, entices people to participate by offering to donate to each respondent’s preferred charity if they complete a survey.
¹⁰ Again, SurveyMonkey’s Audience panel would be the proper way to field a political survey of the general public. SurveyMonkey does offer other products to public and private entities.
¹¹ Google Surveys guarantees “a representative distribution of regions” whenever possible in national surveys. SurveyMonkey Audience national surveys aren’t weighted by region, but the panel has enough members in each region that the results should reflect Census figures for demographic distributions.
¹² For most non-political surveys, the lack of weighting on many of these variables isn’t a big deal because once you leave the realm of politics, these demographic factors are usually far less determinative of a person’s preferences. For example, if a company is asking people what kind of breakfast cereal they like, those preferences probably don’t vary much between people with and without college degrees.
¹³ SurveyMonkey has conducted polling for FiveThirtyEight.
¹⁴ During the 2016 campaign, SurveyMonkey ran horse race polls sponsored by NBC News that used a different collection method than is available to the public.
¹⁵ That is, the people polled within each demographic group hold opinions that don’t match those of the group at large.
