For the past two years, a team of data science experts has been experimenting with offering expert office hours to facilitate the adoption of new methods and technologies across the Institute for Social Research (ISR). These CoderSpaces provide immediate research support and offer hands-on learning opportunities for participants who wish to grow their coding and data science skills. The aim is to foster a casual learning and consulting environment that welcomes everyone regardless of skill level.
CoderSpaces are one way to help researchers thrive in an environment that is becoming increasingly complex. With the ongoing digitization of our daily lives, scholars are gaining access to new types of data streams that have not been traditionally available in their disciplines. For example, social scientists in the ISR at the University of Michigan have started to explore the ways in which virtual interactions on social media platforms can inform the scientific inquiry of socio-behavioral phenomena spanning many aspects of our lives, including election forensics, political communication, and parenting, as well as insights gained from survey research.
Processing and analyzing novel types and ever bigger quantities of data requires that faculty, research staff, and students incorporate new research technologies and methodologies into their scientific toolkits. For example, researchers may need to move computationally intense analyses to a high-performance computing cluster, which requires familiarity with batch processing, a command line interface, and advanced data storage solutions. Or researchers may need to understand and implement natural language processing and machine learning to systematically retrieve information from large amounts of unstructured text.
Researchers who embark on the journey of exploring new technologies or methodologies often cannot fall back on curricula and training opportunities provided by their disciplinary peers. The relevant learning resources still need to be developed – potentially by these researchers themselves one day. To bridge training gaps, scholars look to example applications in other disciplines, engage in interdisciplinary research collaborations to access necessary expertise, and solicit help from available support units on campus to make methodological and technological innovations possible.
CoderSpaces provide just this kind of support. The sessions are hosted by faculty, research staff, and students who are willing to share their methodological and programming expertise with others. Initially, CoderSpaces were limited to the ISR community. Currently, anyone at the University of Michigan is welcome to join, which has allowed us to diversify and broaden the available expertise and research applications. The weekly sessions were originally organized as in-person gatherings at the ISR with the intent to venture out to other campus locations. In March 2020, CoderSpaces moved to a virtual format facilitated by Zoom video-conferencing and a Slack communication space. Going virtual turned out to be a blessing in disguise, as it enabled anyone at the university to participate regardless of their physical location, helping us broaden our reach across U-M departments and disciplines.
Participants join an ongoing Zoom meeting at the scheduled weekly times. The hosts on the call field questions and may use the breakout room feature to assist multiple participants simultaneously. For example, Bryan Kinzer, a PhD student in Mechanical Engineering, attended CoderSpaces a few times to set up and run a Singularity container. He says of his experience: “The hosts were helpful and patient. My issue was not a super easy quick fix, but they were able to point me in the right direction eventually getting the issue resolved. When I came back the following week they remembered my case and were able to pick right back up where I left off.”
Paul Schulz, a senior consulting statistician and data scientist for ISR’s Population Dynamics and Health Program (PDHP), has served as a host since CoderSpaces launched. He describes the weekly CoderSpaces as “an enriching experience that has allowed me and the other PDHP staff members to socialize and broaden our network among other people on campus who work in the data and technical space. By sharing our technical skills and knowledge with attendees, we are providing a service. But we have also been able to improve our own skills and expertise in these areas by being exposed to what others across campus are doing. By fostering these types of informal collaborations and shared experiences, I think that the CoderSpaces have been a win-win for both attendees and hosts alike.”
Post developed by Nicholas Valentino and Katherine Pearson
Survey research is an ever-evolving field. Technology has increased the number of ways to reach respondents, while simultaneously reducing response rates by freeing people from the constraints of one land-line telephone per household. Surveys remain an essential tool for making inferences about societal and political trends, so many survey researchers offer incentives to survey respondents in order to ensure a large and representative sample. Financial incentives to complete surveys, in turn, entice some people to respond to a large number of online surveys on a regular basis, essentially becoming professional survey respondents.
Survey methodologists have carefully considered the ways that survey modes may impact the way people answer questions. Talking to a real person is different from answering questions online. But less is known about how individual factors bias participation in surveys in the first place. For example, might personality traits shape your willingness to answer a survey online versus one administered by someone who comes to your door? New work from researchers at the University of Michigan and Duke University suggests that this is in fact the case.
In order to examine the personality traits of survey respondents, the research team used data from the 2012 and 2016 American National Election Studies (ANES). During these two study periods, the ANES ran parallel online and face-to-face surveys. In both years, the ANES included the Ten-Item Personality Inventory (TIPI), which consists of pairs of items asking respondents to assess their own traits. Based on the responses, researchers build a profile of each respondent’s “Big Five” personality traits: openness to experience, conscientiousness, extraversion, agreeableness, and emotional stability.
Big Five traits with corresponding TIPI qualities (excerpt):
Openness to experience – “open to new experiences, complex”
Emotional stability – “calm, emotionally stable” (reverse-scored item: “anxious, easily upset”)
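To make the pairing concrete, here is a minimal sketch of how a TIPI-style trait score is typically computed: each trait has one standard item and one reverse-scored item on a 1–7 agreement scale, and the trait score averages the two after recoding the reversed item. The example responses below are hypothetical.

```python
# Sketch of TIPI-style trait scoring (hypothetical responses).
# Each trait pairs a standard item with a reverse-scored item on a 1-7 scale.

def score_trait(standard_item: int, reversed_item: int) -> float:
    """Average a standard item with its recoded reverse-scored pair."""
    recoded = 8 - reversed_item  # reverse-code on a 1-7 scale
    return (standard_item + recoded) / 2

# Hypothetical respondent: rates "open to new experiences, complex" a 6
# and the reverse-scored counterpart a 2.
openness = score_trait(6, 2)
print(openness)  # 6.0
```

The same two-item averaging applies to each of the five traits, which is why the inventory needs only ten items in total.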
Researchers were able to compare responses to the TIPI with measures of political predispositions and policy preferences, based on responses to questions on the ANES. These include partisanship, liberal–conservative ideology, issue self-placements, and other measures of political orientation.
Based on these data, the authors found that respondents in the online samples were, on average, less open to experience and more politically conservative on a variety of issues compared to those responding to face-to-face surveys. They also found that the more surveys a respondent completed, the lower they scored on measures of openness. Given that professionalized survey respondents comprise the majority of online survey samples, these results suggest caution for those who would like to generalize results to the population at large. It is not enough to balance samples on simple demographics. Attitudinal and personality-based differences might also lead online sample estimates to diverge from the truth.
It is difficult to say whether online survey respondents or face-to-face respondents are more representative of personality traits in the general population. If personality is a factor in whether someone will participate in a survey, that might bias both types of samples. However, the authors note that the data suggest that professional online samples are the outlier. They find “that samples based on fresh cross-sections, both face-to-face and online, yield better population estimates for personality and political attitudes compared to professionalized panels.” While it may be possible to mitigate the potential sampling bias of personality traits, it is important that survey researchers understand the role that personality traits play in professional online samples.
In every U.S. presidential election since 1948, the American National Election Studies (ANES) has conducted pre- and post-election surveys of a large representative sample of American voters. ANES participant interviews looked different in 2020 than they did in the past; the COVID-19 pandemic made traditional face-to-face interviews impractical and risky. The study team began planning for the extraordinary circumstances in March, without any idea what the conditions would be when interviews began in August. The team pivoted nimbly to redesign the study even as the onset of data collection approached.
The majority of interviews in 2020 were completed as web surveys, some following an online format similar to one used in 2016, and others using an innovative mixed-mode design. Respondents to the mixed-mode surveys were randomly assigned either to complete the questionnaire by themselves online, or to take the survey with a live interviewer via a Zoom video link. Few surveys conduct live video interviews, but the ANES study team felt that it was critical to explore the use of this technology as a potential means of balancing issues of cost, continuity, and data quality.
To answer online surveys, respondents must have reliable access to the Internet and comfort using computers. Under normal circumstances, people without access to computers or the Internet in their homes can gain access in public settings like libraries or at their workplace. With many of these places closed due to the pandemic, online access became a bigger challenge. In mixed-mode cases where it was difficult to complete a web or video interview, interviewers contacted the respondents to secure a phone interview. Providing phone interviews helped the team strengthen sample quality by reaching respondents without access to the Internet as well as those who are less comfortable using computers.
Data collection for the 2020 surveys, out of necessity, departed significantly from the practices of the past 70 years of the ANES. The study team will continue to monitor and address the implications of these changes. In the end, the team was pleased to field a very high quality survey with relatively high response rates, thoroughly vetted questions, and the largest sample in the history of ANES.
Pre-election interviews began in August 2020. The pre-election questionnaire is available on the ANES website. The questionnaire includes time series questions dating back to the earliest days of the ANES survey, as well as new questions that reflect more recent developments in the study of American politics. The ANES team must always be prepared to add a few questions late in the design process to capture substantial developments in the presidential campaign or American society. In 2020 the survey added questions about election integrity, urban unrest, and COVID-19, among other topics.
The investigators, ANES staff, and their survey operations partners at Westat monitored the data collection closely, in case further adjustments in procedures or sample were required. The final pre-election sample consists of over 8,200 complete or sufficient-partial interviews. This includes a reinterview panel with the respondents from the ANES 2016 Time Series. Over 2,800 respondents from the 2016 study were reinterviewed, more than three quarters of the original group.
Post-election interviews began on November 8, 2020, and will be completed on January 4, 2021. This post-election effort includes additional respondents who took part in the 2020 round of the General Social Survey (GSS). Due to the pandemic-altered timing of the GSS data collection, it was not possible to interview these individuals prior to the election. However, these respondents completed nearly all of the ANES post-election interview, plus almost ten minutes of critical questions that appeared on the ANES pre-election interview, and several additional questions suggested by the GSS team.
ANES staff will continue to review and clean the data into the new year, including checks of respondent eligibility that may alter the final sample in modest ways. Pending this review, the team expects response rates to come in slightly below the 2016 web response rates.
Overall, despite the challenges of this past year, the ANES study team was able to gather robust data from a large probability sample of Americans, extending the longest-running, most in-depth, and highest quality survey of US public opinion and voting behavior, at a critical juncture for American society and democracy. The team will continue to share updates, here and on the ANES website, as data from this survey become available.
This post looks at the opinions of Twitter users surrounding the first Presidential Debate. We look at content containing at least one debate hashtag, shared immediately before, during, and after the debate; and we determine the “stance” or opinion (for or against) of each tweet towards Biden and Trump.
The figure below shows the average proportion of expressed support or opposition for the candidate every minute of the debate from 8pm (20:00) to 11:30pm (23:30). A score above zero indicates a net positive stance towards the candidate. A score below zero indicates a net negative stance.
Presidential Debate 1: Stance of Candidates on Twitter
We see that in the hour before the debate begins, both candidates have a net negative stance. In other words, more opinions against each candidate are being shared than opinions for each candidate. At around the 11 minute mark in the debate (roughly 21:11), pro-Biden expressions begin increasing, and continue to increase until the overall stance is in support of Biden. In contrast, around the same time, the stance towards Trump begins decreasing and continues to fall for roughly the next 10 minutes.
Over the course of the debate there are specific moments that help and hurt each of the candidates. When there is perceived bickering, there is usually a decline in stance for both candidates, although there are exceptions. The moment in which Trump received the most support was when he spoke about judges. Biden’s best moment was when he discussed race relations and the need to support black Americans.
By the end of the debate, the stance of Twitter discussion towards Biden had increased by 0.5 – a striking shift. He clearly benefited from the debate, at least in the short term amongst Twitter users. In contrast, the stance of Twitter discussion towards Trump decreased by approximately 0.2. Even as there was a good deal of opposition towards Trump expressed immediately before the debate, there was even more negativity towards him at the end of the debate.
It is worth noting that within an hour of the debate the expressed stance towards Trump returned to pre-debate levels. These are decidedly negative, of course; but the additional negative impact of the debate on Twitter discussion of Trump may have been short-lived. The same is not true for Biden. The hours surrounding the debate saw a marked shift in expressed stance towards Biden, from by-minute averages that were anti-Biden to clearly pro-Biden. The shift is evident only 10 minutes into the 90-minute debate, and durable for the hour following the debate as well.
Twitter is by no means an accurate representation of public opinion more broadly – we must be sure to interpret these results as indicating the debate impact on Twitter discussion, not the public writ large. That said, where Twitter is concerned it seems relatively clear that Biden ‘won’ the debate.
Information about the analysis:
This analysis was conducted using approximately 1.3 million tweets that contained at least one of the debate hashtags. We collected posts using the Twitter Streaming API, restricting the analysis to the core debate hashtags, e.g. #debates2020, #presidentialdebate2020, etc. We determined whether each tweet showed support, opposition, or neither for each candidate. For each minute, we compute an aggregate stance score as follows: Stance Score = (# Support – # Oppose) / (# of tweets that minute having a stance). To determine the stance itself, we trained a fine-tuned BERT model with a single classification layer on 5 million posts related to the 2020 election. We also had three people label 1,000 tweets with stance to further improve our model.
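The per-minute aggregation step described above can be sketched in a few lines. This is an illustrative implementation of the stated formula only, not the authors' pipeline; the minute keys, label names, and sample data are hypothetical.

```python
from collections import defaultdict

# Sketch of the per-minute aggregate stance score:
#   score = (# support - # oppose) / (# tweets that minute with a stance)
# Tweets labeled "neither" are excluded from the denominator.

def stance_scores(tweets):
    """tweets: iterable of (minute, label) pairs; returns {minute: score}."""
    counts = defaultdict(lambda: {"support": 0, "oppose": 0})
    for minute, label in tweets:
        if label in ("support", "oppose"):
            counts[minute][label] += 1
    scores = {}
    for minute, c in counts.items():
        total = c["support"] + c["oppose"]  # tweets with a stance
        scores[minute] = (c["support"] - c["oppose"]) / total
    return scores

# Hypothetical minute of tweets: 2 support, 1 oppose, 1 neither -> (2-1)/3
sample = [("21:11", "support"), ("21:11", "support"),
          ("21:11", "oppose"), ("21:11", "neither")]
print(stance_scores(sample))
```

A score of +1 would mean every stance-bearing tweet that minute supported the candidate; −1 would mean unanimous opposition, matching the axis of the figure above.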
Political science has been enriched by the use of social media data. However, automated text-based classification systems often do not capture image content. Since images provide rich context and information in many tweets, these classifiers do not capture the full meaning of the tweet. In a new paper presented at the 2020 Annual Meeting of the American Political Science Association (APSA), Patrick Wu, Alejandro Pineda, and Walter Mebane propose a new approach for analyzing Twitter data using a joint image-text classifier.
Human coders of social media data are able to observe both the text of a tweet and an attached image to determine the full meaning of an election incident being described. For example, the authors show the image and tweet below.
If only the text is considered, “Early voting lines in Palm Beach County, Florida #iReport #vote #Florida @CNN”, a reader would not be able to tell that the line was long. Conversely, if the image is considered separately from the text, the viewer would not know that it pictured a polling place. It’s only when the text and image are combined that the message becomes clear.
A new framework called Multimodal Representations Using Modality Translation (MARMOT) is designed to improve data labeling for research on social media content. MARMOT uses modality translation to generate captions of the images in the data, then uses a model to learn the patterns between the text features, the image caption features, and the image features. This is an important methodological contribution because modality translation replaces more resource-intensive processes and allows the model to learn directly from the data, rather than on a separate dataset. MARMOT is also able to process observations that are missing either images or text.
MARMOT was applied to two datasets. The first dataset contained tweets reporting election incidents during the 2016 U.S. general election, originally published in “Observing Election Incidents in the United States via Twitter: Does Who Observes Matter?” The tweets in this dataset report some kind of election incident. All of the tweets contain text, and about a third of them contain images. MARMOT performed better at classifying the tweets than the text-only classifier used in the original study.
In order to test MARMOT against a dataset containing images for every observation, the authors used the Hateful Memes dataset released by Facebook to assess whether a meme is hateful or not. In this case, a multimodal model is useful because it is possible for neither the text nor the image to be hateful, but the combination of the two may create a hateful message. In this application, MARMOT outperformed other multimodal classifiers in terms of accuracy.
As more and more political scientists use data from social media in their research, classifiers will have to become more sophisticated to capture all of the nuance and meaning that can be packed into small parcels of text and images. The authors plan to continue refining MARMOT, and expand the models to accommodate additional elements such as video, geographical information, and time of posting.
“Not which ones, but how many” is a phrase used in list experiment instructions, where researchers tell participants, “After I read all four (five) statements, just tell me how many of them upset you. I don’t want to know which ones, just how many.” In retrospect, I was surprised to see that this phrase encapsulates not only the key research idea, but also my fieldwork adventure: not which plans could go awry, but how many. The fieldwork experience could be frustrating at times, but it has led me to uncharted terrain and brought insights into the research contexts. The valuable exposure would not have been possible without support from the Roy Pierce Award and guidance from Professor Yuki Shiraito.
Research that I conducted with Yuki Shiraito explores the effect of behavior on political attitudes in authoritarian contexts to answer the question: does voting for autocracy reinforce individual regime support? To answer this question, two conditions need to be true. First, people need to honestly report their level of support before and after voting in authoritarian elections. Second, voting behavior needs to be random. Neither situation is probable in illiberal autocracies. Our project addresses these methodological challenges by conducting a field experiment that combines a list experiment and a randomized encouragement design in China.
In this study, list experiments are used instead of direct questions to measure the respondents’ attitudes towards the regime in the pre- and post-election surveys. The list experiment is a survey technique to mitigate preference falsification by respondents. Although the true preference of each individual respondent remains hidden, the technique allows us to identify the average level of support for the regime within a group of respondents. In addition, we employ a randomized encouragement design in which get-out-the-vote messages are randomly assigned, which helps us estimate the average causal effect of a treatment. For the effect moderated by prior support for the regime, we estimate the probability of prior support using individual characteristics and then estimate the effect for prior supporters via a latent variable model.
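The standard way a list experiment recovers a group-level estimate is a difference in means: the control group counts upsetting items from J baseline statements, the treatment group from the same J statements plus the sensitive one, so the gap in average counts estimates the prevalence of the sensitive attitude. A minimal sketch, with entirely hypothetical counts (this is the generic estimator, not the authors' latent variable model):

```python
# Difference-in-means estimator for a list experiment (hypothetical data).
# Control respondents see J baseline items; treatment respondents see the
# same J items plus the sensitive item. Each reports only a count, so no
# individual answer is revealed, yet the mean difference estimates how
# many respondents hold the sensitive attitude.

def list_experiment_estimate(treatment_counts, control_counts):
    """Estimate the prevalence of the sensitive item from reported counts."""
    mean = lambda xs: sum(xs) / len(xs)
    return mean(treatment_counts) - mean(control_counts)

treatment = [2, 3, 1, 4, 2, 3]  # counts out of J + 1 items
control = [2, 2, 1, 3, 2, 2]    # counts out of J items
print(round(list_experiment_estimate(treatment, control), 2))  # 0.5
```

Here the estimate of 0.5 would mean roughly half the group is upset by the sensitive item, even though no single respondent disclosed it, which is exactly the anonymity that the fieldwork anecdote below shows respondents declining to use.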
While the theoretical part of the project went smoothly and the simulation results were promising, the complications of fieldwork exceeded my expectations. For the list experiment survey, the usually reticent respondents started asking questions about the list questions immediately after the questionnaires were distributed. Their queries took the form of “I am upset by options 1, 2, and 4, so what number should I write down here?” This was not supposed to happen. List experiments were developed to conceal individual respondents’ answers from researchers. By replacing the question of “which ones” with the question of “how many,” respondents’ true preferences are not directly observable, which makes it easier for them to answer sensitive questions honestly. Respondents’ eagerness to tell me their options directly defeats the purpose of this design. Later I learned from other researchers that the problem I encountered was common in list experiment implementation regardless of research context and type of respondent.
The rationale behind respondents’ desire to share their individual options despite being given a chance to hide them is thought-provoking. Is it because of the cognitive burden of answering a list question, which is not a familiar type of question to respondents? Or is it because the sensitive items, despite careful construction, raise the alarm? Respondents are eager to specify their stance on each option and identify themselves as regime supporters: they do not leave any room for misinterpretation. To ease the potential cognitive burden, we will try a new way to implement the list experiment in a similar project on preference falsification in Japan. We are looking forward to seeing if it improves respondents’ comprehension of the list question setup. The second explanation is more concerning, however. It suggests a scope condition on list experiments as a valid tool to elicit truthful answers from respondents. Other, more implicit tools, such as endorsement experiments, may be appropriate in those contexts to gauge respondents’ preferences.
Besides the intricacies of the list experiment, carrying out an encouragement design on the ground is challenging. We had to modify the behavioral intervention to adapt to the needs of our local collaborators, and the realized sample size was only a fraction of the initially negotiated size. Despite the compromises, the implementation was imbued with uncertainty: meetings were postponed or rescheduled at the last minute, and instructions from local partners were sometimes inconsistent and conflicting. The frustration was certainly real. But the pain made me cognizant of the judgment calls researchers have to make behind the scenes. The amount of effort required to produce reliable data is admirable. And as a consumer of data, I should always interpret data with great caution.
While the pilot study did not directly lead to a significant finding, the research experience and the methods we developed have informed the design of a larger project that we are currently conducting in Japan.
I always thought of doing research as establishing a series of logical steps between a question and an answer. Before I departed for the pilot study, I made a detailed timeline for the project with color-coded tasks, flourish-shaped arrows pointing at milestones of the upcoming fieldwork. When I presented this plan to Professor Shiraito, he smiled and told me that “when doing research, it is generally helpful to think of the world in two ways: the ideal world and the real world. You should be prepared for both.” Wise words. Because of this, I am grateful for the Roy Pierce Award for offering the opportunity to catch a glimpse of the real world. And I am indebted to Professor Shiraito for helping me see the potential of attaining the ideal world with intelligence and appropriate tools.