Social media might give the kind of insight into political public opinion that could help improve the accuracy of election analyses and forecasts.
Purush Papatla, a UWM professor of marketing, helped to lead the Northwestern Mutual Data Science Institute’s Elecurator research project, which analyzed social media posts as a way to help identify important issues among voters in the 2020 presidential campaign. Papatla and the team ended up using that work to take a stab at predicting the outcome of the election.
They might be on to something.
The Elecurator project forecast that Democratic challenger Joe Biden had a 95% chance of winning 282 to 311 Electoral College votes in the presidential election, while President Donald Trump had a 95% chance of winning 226 to 253 Electoral College votes.
A presidential candidate must reach 270 to claim victory. Biden defeated Trump 306-232, with Congress scheduled to count the electoral votes on Jan. 6 as part of the formal process of declaring the winner.
“I knew that the forecast was going to be good, but I was really surprised at how on target it was,” said Papatla, co-director of the Data Science Institute, which is a collaboration with Northwestern Mutual and Marquette University.
An ‘aha’ moment
That’s in part because Elecurator wasn’t intended to forecast the election in the first place.
“Our focus throughout the length of the project was entirely on identifying issues of importance to the electorate using social media. Forecasting wasn’t our goal,” Papatla said.
The project tracked social media posts daily starting in June. As the research wore on through the summer and into the fall, Papatla thought about whether the work also could be used to produce an election forecast.
“The first time we tried this approach out was toward the end of October, which is when we felt that it held promise. It was essentially our ‘aha’ moment,” Papatla said.
The team posted the first forecast online on Nov. 2, then updated it on Election Day and the day after the election. Papatla said that while more research is needed to incorporate social media analysis into polling, Elecurator offers an encouraging start.
Accuracy of polling
Traditional polling has drawn scrutiny over the last two presidential elections, especially for underestimating support for President Trump in key battleground states. The process typically involves pollsters asking questions in person, through an online survey, or over the phone.
Papatla said previous research has shown that people responding to polls might give the answers they think pollsters want to hear. Poll respondents also may not answer all questions honestly because they don’t want to share their actual opinions with a stranger.
“I have wondered for some time whether traditional methods of polling are best at this time,” Papatla said.
Platforms like Twitter and Reddit let users post with near anonymity, so they can express opinions they may not feel comfortable sharing in person or over the phone.
“Social media is where people can exhibit their true feelings without inhibition,” Papatla said. “That’s the advantage of social media.”
Positives and negatives
The Elecurator forecast involved analyzing tens of millions of posts from Facebook, Twitter, Reddit and other sites, and applying algorithms to determine each message’s “polarity,” or how positive or negative the message was toward a candidate.
The analysis focused on posts related to the COVID-19 pandemic, the economy and race relations, which many traditional polls identified as the top three issues among voters. Posts were collected through a third-party provider, which also supplied geographic identifiers that Elecurator researchers could use to assign a post to a state.
“From the millions of tweets, we were able to calculate the sentiment towards each issue towards each candidate in each state,” Papatla said.
Polarity scores for each issue were then averaged together for each candidate in each state, creating an average score that the Elecurator team used to allocate electoral votes.
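The averaging step described above can be illustrated with a minimal sketch. It is not the Elecurator team's code: the sample posts, the polarity scale from -1 to 1, the function names and the winner-take-all rule for allocating a state's electoral votes are all assumptions made for illustration, since the article does not spell out those details.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical scored posts: (state, candidate, issue, polarity).
# Polarity is assumed to run from -1 (negative) to 1 (positive).
POSTS = [
    ("WI", "Biden", "pandemic", 0.6),
    ("WI", "Biden", "pandemic", 0.2),
    ("WI", "Biden", "economy", -0.1),
    ("WI", "Trump", "pandemic", -0.4),
    ("WI", "Trump", "economy", 0.3),
    ("TX", "Biden", "economy", -0.2),
    ("TX", "Trump", "economy", 0.5),
    ("TX", "Trump", "pandemic", 0.1),
]

# Electoral votes for the two example states in 2020.
ELECTORAL_VOTES = {"WI": 10, "TX": 38}


def state_scores(posts):
    """Average polarity per issue, then average the issue means
    for each (state, candidate) pair."""
    by_issue = defaultdict(list)
    for state, candidate, issue, polarity in posts:
        by_issue[(state, candidate, issue)].append(polarity)

    issue_means = defaultdict(list)
    for (state, candidate, _issue), values in by_issue.items():
        issue_means[(state, candidate)].append(mean(values))

    return {key: mean(values) for key, values in issue_means.items()}


def allocate(scores, electoral_votes):
    """Assumed rule: the candidate with the higher average score
    in a state receives all of that state's electoral votes."""
    totals = defaultdict(int)
    for state, votes in electoral_votes.items():
        in_state = {c: s for (st, c), s in scores.items() if st == state}
        winner = max(in_state, key=in_state.get)
        totals[winner] += votes
    return dict(totals)


if __name__ == "__main__":
    scores = state_scores(POSTS)
    print(allocate(scores, ELECTORAL_VOTES))
```

In this toy data, each issue counts equally toward a candidate's state score no matter how many posts mention it, which is one possible reading of "averaged together."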
More research needed
Papatla noted issues with social media that require more research, such as the reliability of geographic locations associated with the accounts of people who post, and what to do with posts coming from automated accounts.
“Clearly, it’s not all positives. There are some issues with social media as well,” Papatla said. “There’s a lot of fake news. There’s a lot of bot activity.”
He noted, though, that automated posts have less impact when they are part of a much larger pool of tens of millions of posts. Larger sample sizes tend to yield more accurate results, much as in traditional polling.
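A small arithmetic sketch shows the dilution effect. It assumes a fixed number of bot posts, all with the maximum positive polarity, mixed into a growing pool of genuine posts. The numbers and the function name are hypothetical, and real bot activity may well grow along with overall volume.

```python
def observed_mean(genuine_count, genuine_mean, bot_count, bot_polarity):
    """Average polarity of a pool that mixes genuine and bot posts."""
    total = genuine_count + bot_count
    weighted = genuine_count * genuine_mean + bot_count * bot_polarity
    return weighted / total


if __name__ == "__main__":
    GENUINE_MEAN = 0.10  # assumed true average sentiment
    BOT_POSTS = 5_000    # assumed fixed number of bot posts
    for n in (10_000, 1_000_000, 50_000_000):
        shift = observed_mean(n, GENUINE_MEAN, BOT_POSTS, 1.0) - GENUINE_MEAN
        print(f"{n:>11,} genuine posts: average shifts by {shift:.5f}")
```

The shift in the average shrinks as the genuine pool grows, because the same bot posts make up a smaller share of the total.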
“This is clearly the early stages in terms of this kind of approach and process,” he said, “but I think it holds a lot of promise.”