Post developed by Nicholas Valentino and Katherine Pearson
Survey research is an ever-evolving field. Technology has increased the number of ways to reach respondents, while simultaneously reducing response rates by freeing people from the constraint of one landline telephone per household. Surveys remain an essential tool for making inferences about societal and political trends, so many survey researchers offer incentives to respondents in order to ensure a large and representative sample. Financial incentives to complete surveys, in turn, entice some people to respond to a large number of online surveys on a regular basis, essentially becoming professional survey respondents.
Survey methodologists have carefully considered the ways that survey mode may affect how people answer questions. Talking to a real person is different from answering questions online. But less is known about how individual factors bias participation in surveys in the first place. For example, might your personality traits shape whether you agree to answer a survey online versus for someone who comes to your door? New work from researchers at the University of Michigan and Duke suggests that this is in fact the case.
In order to examine the personality traits of survey respondents, the research team used data from the 2012 and 2016 American National Election Studies (ANES). In both of those years, the ANES ran parallel online and face-to-face surveys, and both included the Ten-Item Personality Inventory (TIPI), which consists of pairs of items asking respondents to assess their own traits. The responses yield a profile of “the Big Five” personality traits: openness to experience, conscientiousness, extraversion, agreeableness, and emotional stability.
Big Five traits with corresponding TIPI qualities:
- Openness to experience: “open to new experiences, complex” vs. “conventional, uncreative”
- Conscientiousness: “dependable, self-disciplined” vs. “disorganized, careless”
- Extraversion: “extraverted, enthusiastic” vs. “reserved, quiet”
- Agreeableness: “sympathetic, warm” vs. “critical, quarrelsome”
- Emotional stability: “calm, emotionally stable” vs. “anxious, easily upset”
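For readers unfamiliar with the TIPI, its scoring is straightforward: each Big Five trait is the average of two 7-point items, one of them reverse-keyed (reverse score = 8 − rating). The sketch below follows the standard published TIPI key; the example ratings are made up for illustration.

```python
# Illustrative scoring of the Ten-Item Personality Inventory (TIPI).
# Each Big Five trait is the mean of two 7-point items, one normally
# keyed and one reverse-keyed (reverse score = 8 - rating).

# trait -> (normally keyed item, reverse-keyed item), 1-based indices
TIPI_KEY = {
    "extraversion":        (1, 6),
    "agreeableness":       (7, 2),
    "conscientiousness":   (3, 8),
    "emotional_stability": (9, 4),
    "openness":            (5, 10),
}

def score_tipi(ratings):
    """ratings: list of ten 1-7 responses, in TIPI item order."""
    assert len(ratings) == 10 and all(1 <= r <= 7 for r in ratings)
    scores = {}
    for trait, (pos, rev) in TIPI_KEY.items():
        scores[trait] = (ratings[pos - 1] + (8 - ratings[rev - 1])) / 2
    return scores

# Hypothetical respondent: high openness, moderate extraversion
profile = score_tipi([5, 2, 6, 3, 7, 2, 6, 1, 6, 2])
```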
Researchers were able to compare responses to the TIPI with measures of political predispositions and policy preferences, based on responses to questions on the ANES. These include partisanship, liberal–conservative ideology, issue self-placements, and other measures of political orientation.
Based on these data, the authors found that respondents in the online samples were, on average, less open to experience and more politically conservative on a variety of issues than those responding to face-to-face surveys. They also found that the more surveys a respondent had completed, the lower they scored on measures of openness. Given that professionalized survey respondents make up the majority of online survey samples, these results counsel caution for those who would like to generalize results to the population at large. It is not enough to balance samples on simple demographics: attitudinal and personality-based differences might also lead online sample estimates to diverge from the truth.
It is difficult to say whether online survey respondents or face-to-face respondents are more representative of personality traits in the general population. If personality is a factor in whether someone will participate in a survey, that might bias both types of samples. However, the authors note that the data suggest that professional online samples are the outlier. They find “that samples based on fresh cross-sections, both face-to-face and online, yield better population estimates for personality and political attitudes compared to professionalized panels.” While it may be possible to mitigate the potential sampling bias of personality traits, it is important that survey researchers understand the role that personality traits play in professional online samples.
In every U.S. presidential election since 1948, the American National Election Studies (ANES) has conducted pre- and post-election surveys of a large representative sample of American voters. ANES participant interviews looked different in 2020 than they did in the past; the COVID-19 pandemic made traditional face-to-face interviews impractical and risky. The study team began planning for the extraordinary circumstances in March, without any idea what conditions would be when interviews began in August. The team pivoted nimbly to redesign the study even as the onset of data collection approached.
The majority of interviews in 2020 were completed as web surveys, some following an online format similar to one used in 2016, and others using an innovative mixed-mode design. Respondents to the mixed-mode surveys were randomly assigned either to complete the questionnaire by themselves online, or to take the survey with a live interviewer via a Zoom video link. Few surveys conduct live video interviews, but the ANES study team felt that it was critical to explore the use of this technology as a potential means of balancing issues of cost, continuity, and data quality.
To answer online surveys, respondents must have reliable access to the Internet and comfort using computers. Under normal circumstances, people without access to computers or the Internet in their homes can gain access in public settings like libraries or at their workplace. With many of these places closed due to the pandemic, online access became a bigger challenge. In mixed-mode cases where it was difficult to complete a web or video interview, interviewers contacted the respondents to secure a phone interview. Providing phone interviews helped the team strengthen sample quality by reaching respondents without access to the Internet as well as those who are less comfortable using computers.
Data collection for the 2020 surveys, out of necessity, departed significantly from the practices of the past 70 years of the ANES. The study team will continue to monitor and address the implications of these changes. In the end, the team was pleased to field a very high quality survey with relatively high response rates, thoroughly vetted questions, and the largest sample in the history of ANES.
Pre-election interviews began in August 2020. The pre-election questionnaire is available on the ANES website. The questionnaire includes time series questions dating back to the earliest days of the ANES survey, as well as new questions that reflect more recent developments in the study of American politics. The ANES team must always be prepared to add a few questions late in the design process to capture substantial developments in the presidential campaign or American society. In 2020 the survey added questions about election integrity, urban unrest, and COVID-19, among other topics.
The investigators, ANES staff, and their survey operations partners at Westat monitored the data collection closely, in case further adjustments in procedures or sample were required. The final pre-election sample consists of over 8,200 complete or sufficient-partial interviews. This includes a reinterview panel with the respondents from the ANES 2016 Time Series. Over 2,800 respondents from the 2016 study were reinterviewed, more than three quarters of the original group.
Post-election interviews began on November 8, 2020, and will be completed on January 4, 2021. This post-election effort includes additional respondents who took part in the 2000 study of the General Social Survey (GSS). Due to the pandemic-altered timing of the GSS data collection, it was not possible to interview these individuals prior to the election. However, these respondents completed nearly all of the ANES post-election interview, plus almost ten minutes of critical questions that appeared on the ANES pre-election interview, and several additional questions suggested by the GSS team.
ANES staff will continue to review and clean the data into the new year, including checks of respondent eligibility that may alter the final sample in modest ways. Pending this review, the team expects response rates to come in slightly below the 2016 web response rates.
Overall, despite the challenges of this past year, the ANES study team was able to gather robust data from a large probability sample of Americans, extending the longest-running, most in-depth, and highest quality survey of US public opinion and voting behavior, at a critical juncture for American society and democracy. The team will continue to share updates, here and on the ANES website, as data from this survey become available.
This post looks at the opinions of Twitter users surrounding the first Presidential Debate. We look at content containing at least one debate hashtag, shared immediately before, during, and after the debate; and we determine the “stance” or opinion (for or against) of each tweet towards Biden and Trump.
The figure below shows the average proportion of expressed support or opposition for the candidate every minute of the debate from 8pm (20:00) to 11:30pm (23:30). A score above zero indicates a net positive stance towards the candidate. A score below zero indicates a net negative stance.
Presidential Debate 1: Stance of Candidates on Twitter
We see that in the hour before the debate begins, both candidates have a net negative stance. In other words, more opinions against each candidate are being shared than opinions for each candidate. At around the 11-minute mark in the debate (roughly 21:11), pro-Biden expressions begin increasing, and continue to increase until the overall stance is in support of Biden. In contrast, around the same time, stance towards Trump begins to decrease and continues to fall over the next 10 minutes.
Over the course of the debate there are specific moments that help and hurt each of the candidates. When there is perceived bickering, there is usually a decline in stance for both candidates, although there are exceptions. The moment in which Trump received the most support was when he spoke about judges. Biden’s best moment was when he discussed race relations and the need to support black Americans.
By the end of the debate, the stance of Twitter discussion towards Biden had increased by 0.5 – a striking shift. He clearly benefited from the debate, at least in the short term amongst Twitter users. In contrast, the stance of Twitter discussion towards Trump decreased by approximately 0.2. Even though there was already a good deal of opposition towards Trump expressed immediately before the debate, there was even more negativity towards him at its end.
It is worth noting that within an hour of the debate the expressed stance towards Trump returned to pre-debate levels. These are decidedly negative, of course; but the additional negative impact of the debate on Twitter discussion of Trump may have been short-lived. The same is not true for Biden. The hours surrounding the debate saw a marked shift in expressed stance towards Biden, from by-minute averages that were anti-Biden to clearly pro-Biden. The shift is evident only 10 minutes into the 90-minute debate, and durable for the hour following the debate as well.
Twitter is by no means an accurate representation of public opinion more broadly – we must be sure to interpret these results as indicating the debate impact on Twitter discussion, not the public writ large. That said, where Twitter is concerned it seems relatively clear that Biden ‘won’ the debate.
Information about the analysis:
This analysis was conducted using approximately 1.3 million tweets that contained at least one of the debate hashtags. We collected posts using the Twitter Streaming API, restricting to the core debate hashtags, e.g. #debates2020, #presidentialdebate2020, etc. We determined whether each tweet showed support, opposition, or neither for each candidate. For each minute, we computed an aggregate stance score as follows: Stance Score = (# Support – # Oppose) / (# of tweets that minute having a stance). To determine the stance itself, we trained a fine-tuned BERT model with a single classification layer on 5 million posts related to the 2020 election. We also had three people label 1,000 tweets for stance to further improve our model.
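The per-minute aggregation itself is simple enough to reproduce in a few lines of Python. This is a minimal sketch with made-up labels, not the production pipeline; tweets labeled “neither” are excluded from the denominator, matching the formula above.

```python
from collections import defaultdict

# Per-minute stance score: (# support - # oppose) / (# tweets with a stance).
def stance_by_minute(tweets):
    """tweets: iterable of (minute, stance) pairs, where stance is one of
    "support", "oppose", or "neither". Returns {minute: score in [-1, 1]}."""
    support = defaultdict(int)
    oppose = defaultdict(int)
    for minute, stance in tweets:
        if stance == "support":
            support[minute] += 1
        elif stance == "oppose":
            oppose[minute] += 1
        # "neither" tweets carry no stance and are ignored
    scores = {}
    for minute in set(support) | set(oppose):
        total = support[minute] + oppose[minute]
        scores[minute] = (support[minute] - oppose[minute]) / total
    return scores

# Toy example: minute 0 has 1 support, 2 oppose, 1 neither; minute 1 has 1 support
scores = stance_by_minute([
    (0, "support"), (0, "oppose"), (0, "oppose"), (0, "neither"), (1, "support"),
])
```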
Political science has been enriched by the use of social media data. However, automated text-based classification systems often do not capture image content. Since images provide rich context and information in many tweets, these classifiers do not capture the full meaning of the tweet. In a new paper presented at the 2020 Annual Meeting of the American Political Science Association (APSA), Patrick Wu, Alejandro Pineda, and Walter Mebane propose a new approach for analyzing Twitter data using a joint image-text classifier.
Human coders of social media data are able to observe both the text of a tweet and an attached image to determine the full meaning of an election incident being described. For example, the authors show the image and tweet below.
If only the text is considered, “Early voting lines in Palm Beach County, Florida #iReport #vote #Florida @CNN”, a reader would not be able to tell that the line was long. Conversely, if the image is considered separately from the text, the viewer would not know that it pictured a polling place. It’s only when the text and image are combined that the message becomes clear.
A new framework called Multimodal Representations Using Modality Translation (MARMOT) is designed to improve data labeling for research on social media content. MARMOT uses modality translation to generate captions of the images in the data, then uses a model to learn the patterns between the text features, the image caption features, and the image features. This is an important methodological contribution because modality translation replaces more resource-intensive processes and allows the model to learn directly from the data, rather than on a separate dataset. MARMOT is also able to process observations that are missing either images or text.
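As a rough illustration of this data flow (not the authors’ code), the sketch below stubs out the learned components: an image is first translated into a caption, each modality is embedded, and the pieces are combined into one joint representation, with missing modalities zero-padded so that image-only or text-only posts can still be processed. The function names and the simple concatenation are illustrative assumptions; MARMOT itself learns these representations jointly.

```python
# Conceptual sketch of the MARMOT data flow with stand-in components.

def caption_image(image):
    """Stand-in for a learned image captioner (the modality translation step)."""
    return "people waiting in a long line outside a building"

def embed(content, dim=4):
    """Stand-in feature extractor; a real model would return learned features."""
    return [0.0] * dim

def marmot_features(text=None, image=None, dim=4):
    """Build one joint feature vector from text, image caption, and image.
    Missing modalities are zero-padded, mirroring MARMOT's ability to
    handle observations that lack either text or an image."""
    text_f = embed(text, dim) if text is not None else [0.0] * dim
    if image is not None:
        image_f = embed(image, dim)
        caption_f = embed(caption_image(image), dim)
    else:
        image_f = caption_f = [0.0] * dim
    # Concatenation stands in for the model's learned joint encoding
    return text_f + caption_f + image_f
```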
MARMOT was applied to two datasets. The first dataset contained tweets reporting election incidents during the 2016 U.S. general election, originally published in “Observing Election Incidents in the United States via Twitter: Does Who Observes Matter?” The tweets in this dataset report some kind of election incident. All of the tweets contain text, and about a third of them contain images. MARMOT performed better at classifying the tweets than the text-only classifier used in the original study.
In order to test MARMOT against a dataset containing images for every observation, the authors used the Hateful Memes dataset released by Facebook to assess whether a meme is hateful or not. In this case, a multimodal model is useful because it is possible for neither the text nor the image to be hateful, but the combination of the two may create a hateful message. In this application, MARMOT outperformed other multimodal classifiers in terms of accuracy.
As more and more political scientists use data from social media in their research, classifiers will have to become more sophisticated to capture all of the nuance and meaning that can be packed into small parcels of text and images. The authors plan to continue refining MARMOT, and expand the models to accommodate additional elements such as video, geographical information, and time of posting.
“Not which ones, but how many” is a phrase used in list experiment instructions, where researchers tell participants, “After I read all four (five) statements, just tell me how many of them upset you. I don’t want to know which ones, just how many.” In retrospect, I was surprised to see that this phrase encapsulates not only the key research idea but also my fieldwork adventure: not which plans could go awry, but how many. The fieldwork experience was frustrating at times, but it led me to uncharted terrain and brought insights into the research contexts. This valuable exposure would not have been possible without support from the Roy Pierce Award and guidance from Professor Yuki Shiraito.
Research that I conducted with Yuki Shiraito explores the effect of behavior on political attitudes in authoritarian contexts, asking: does voting for autocracy reinforce individual regime support? To answer this question directly, two conditions would need to hold. First, people would need to report their level of support honestly both before and after voting in authoritarian elections. Second, voting behavior would need to be random. Neither is likely in illiberal autocracies. Our project addresses these methodological challenges by conducting a field experiment in China that combines a list experiment with a randomized encouragement design.
In this study, list experiments are used instead of direct questions to measure respondents’ attitudes towards the regime in the pre- and post-election surveys. The list experiment is a survey technique that mitigates preference falsification by respondents. Although the true preference of any individual respondent remains hidden, the technique allows us to identify the average level of support for the regime within a group of respondents. In addition, we employ a randomized encouragement design in which get-out-the-vote messages are randomly assigned, which helps us estimate the average causal effect of the treatment. For effects moderated by prior support for the regime, we estimate the probability of prior support using individual characteristics and then estimate the effect for prior supporters via a latent variable model.
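The core of the list experiment is a simple difference in means: control respondents report how many items on a baseline list upset them, treated respondents see the same list plus the sensitive item, and the gap between the two groups’ average counts estimates the share of respondents for whom the sensitive item holds. The sketch below uses made-up counts and shows only this standard estimator, not our latent variable model.

```python
# Standard difference-in-means list experiment estimator.
# Control group counts upsetting items on a 4-item baseline list;
# treatment group counts on the same list plus one sensitive item.
# The difference in mean counts estimates the prevalence of the
# sensitive attitude without revealing any individual's answer.

def list_experiment_estimate(control_counts, treated_counts):
    mean = lambda xs: sum(xs) / len(xs)
    return mean(treated_counts) - mean(control_counts)

# Hypothetical data: per-respondent item counts
control = [1, 2, 0, 2, 1, 1]   # counts over 4 baseline items
treated = [2, 2, 1, 3, 1, 2]   # counts over 4 baseline + 1 sensitive item

estimate = list_experiment_estimate(control, treated)
```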
While the theoretical part of the project went smoothly and the simulation results were promising, the complications of fieldwork exceeded my expectations. For the list experiment survey, the usually reticent respondents started asking questions about the list items immediately after the questionnaires were distributed. Their queries took the form of “I am upset by options 1, 2, and 4, so what number should I write down here?” This was not supposed to happen. List experiments are designed to conceal individual respondents’ answers from researchers. By replacing the question of “which ones” with the question of “how many,” respondents’ true preferences are not directly observable, which makes it easier for them to answer sensitive questions honestly. Respondents’ eagerness to tell me their options directly defeats the purpose of the design. I later learned from other researchers that the problem I encountered is common in list experiment implementations, regardless of research context or type of respondent.
The rationale behind respondents’ desire to share their individual options, despite being given a chance to hide them, is thought-provoking. Is it the cognitive burden of answering a list question, a format unfamiliar to respondents? Or is it that the sensitive items, despite careful construction, raise the alarm? Respondents are eager to specify their stance on each option and to identify themselves as regime supporters: they do not leave any room for misinterpretation. To ease the potential cognitive burden, we will try a new way of implementing the list experiment in a similar project on preference falsification in Japan. We look forward to seeing whether it improves respondents’ comprehension of the list question setup. The second explanation is more concerning, however. It suggests a scope condition on list experiments as a valid tool for eliciting truthful answers. Other, more implicit tools, such as endorsement experiments, may be more appropriate in those contexts for gauging respondents’ preferences.
Besides the intricacies of the list experiment, carrying out an encouragement design on the ground proved challenging. We had to modify the behavioral intervention to accommodate the needs of our local collaborators, and the realized sample size was only a fraction of the initially negotiated size. Even after these compromises, the implementation was imbued with uncertainty: meetings were postponed or rescheduled at the last minute, and instructions from local partners were sometimes inconsistent or contradictory. The frustration was certainly real. But the pain made me cognizant of the judgment calls researchers have to make backstage. The amount of effort required to produce reliable data is admirable, and as a consumer of data, I should always interpret it with great caution.
While the pilot study did not directly lead to a significant finding, the research experience and the methods we developed have informed the design of a larger project that we are currently conducting in Japan.
I always thought of doing research as establishing a series of logical steps between a question and an answer. Before I departed for the pilot study, I made a detailed timeline for the project with color-coded tasks, flourish-shaped arrows pointing at milestones of the upcoming fieldwork. When I presented this plan to Professor Shiraito, he smiled and told me that “when doing research, it is generally helpful to think of the world in two ways: the ideal world and the real world. You should be prepared for both.” Wise words. Because of this, I am grateful for the Roy Pierce Award for offering the opportunity to catch a glimpse of the real world. And I am indebted to Professor Shiraito for helping me see the potential of attaining the ideal world with intelligence and appropriate tools.
Christian Sandvig, the Director of the new Center for Ethics, Society, and Computing (ESC), says he developed this new center “to reconcile the fact that I love computers, but I’m horrified by some of the things we do with them.” ESC is dedicated to intervening when digital media and computing technologies reproduce inequality, exclusion, corruption, deception, racism, or sexism. The center was officially launched at an event on January 24, 2020. Video of the event is available here.
The associate director of ESC, Silvia Lindtner, elaborated on ESC’s mission at the event. “I’ve learned over the years not to shy away from talking about things that are uncomfortable,” she said. “This includes talking about things like sexism, racism, and various forms of exploitation – including how this involves us as researchers, and how we’ve experienced these ourselves.”
ESC is sponsored by the University of Michigan School of Information, Center for Political Studies (CPS), and the Department of Communication and Media. CPS Director Ken Kollman called the new center “an exciting, interdisciplinary effort to ask and address challenging questions about technology, power, and inequality.” Thomas Finholt, Dean of the School of Information, said, “if you look at the world around us there are a seemingly unlimited number of examples where individual leaders or contributors would have benefitted dramatically from the themes this center is going to take on.”
The wide range of disciplines represented among the ESC faculty is essential to its mission. “To have people in computer science, engineering, social science, and humanities interacting together on questions about the impacts of technology strikes me as the kind of necessary, but all too rare, collaborative efforts for generating new ideas and insights,” Kollman said.
Christian Sandvig, Thomas Finholt, and Silvia Lindtner cut the ribbon to launch the ESC Center
The launch event consisted of two panel discussions featuring notable experts in technology and its applications. The first panel, “Accountable Technology — An Oxymoron?” explored the ways that big companies, the media, and individual consumers of technology hold the tech industry accountable for issues of equity and fairness. Pulitzer Prize-winning journalist Julia Angwin highlighted journalists’ role in investigating and framing coverage of tech, including her work to launch a publication dedicated to investigating the technology industry. Jen Gennai, a Google executive responsible for ethics, fielded questions from the audience about accountability. danah boyd, Principal Researcher at Microsoft Research and founder of Data & Society, and Marc DaCosta, co-founder and chairman of Enigma, rounded out the panel, which was moderated by Sandvig.