The Political Economy of Data Production

Post developed by Catherine Allen-West, Charles Crabtree and Andrew Kerner

ICYMI (In Case You Missed It), the following work was presented at the 2017 Annual Meeting of the American Political Science Association (APSA). The presentation, titled “The IMF and the Political Economy of GDP Data Production,” was part of the session “Economic Growth and Contraction: Causes, Consequences, and Measurement” on Sunday, September 3, 2017.

Political economists often theorize about relationships between politics and macroeconomics in the developing world; specifically, which political or social structures promote economic growth, or wealth, or economic openness, and conversely, how those economic outcomes affect politics. Answering these questions often requires some reference to macroeconomic statistics. However, recent work has questioned these data’s accuracy and objectivity. An under-explored aspect of these data’s limitations is their instability over time. Macroeconomic data are frequently revised ex post, or after the fact, and as such one could ask the same question of (ostensibly) the same data and get different answers depending on when the question was asked.

We set out to explore the political economy of data production by examining a newly available dataset of ex post revisions to World Development Indicators (WDI) data.[1] Ex post revisions occur when newly available information changes national statistical offices’ beliefs about the nature of the economy. Those revisions extend into the past, effectively rewriting history and, in the process, providing a reasonable proxy for the inaccuracy of the initial reports. These revisions affect a wide swath of data, but we focus on Gross Domestic Product (GDP) and GDP-derived measures, like GDP per capita and GDP growth. GDP revisions are common: most GDP data available for download at the WDI are different now than they were at the time of their initial release. Normally these changes are subtle; other times they are substantial enough to condemn prior data releases as misleading.

We use these revisions to answer two related questions. First, how sensitive are political-economy relationships to GDP revisions? Should researchers worry about revisions-driven instability in the state of political-economic knowledge? We show that they should. To illustrate, we subject a simple, bivariate statistical relationship between democracy and growth to re-estimation using alternative versions of the “same” data. The democracy-growth relationship has been a topic of sufficient interest in economics and political science that instability in this relationship should give us pause. Seen in this light, our estimates are worrisome. As we show in Figure 1 below, our estimates are unstable across different “observation years” and, further, they are unstable in ways that suggest that initial estimates were biased. Rather than simply a diminution of standard errors as more heavily revised data are introduced (which is what we would expect to see if revisions simply reduced random “noise in the data”), the estimated coefficients for Democracy change substantially across models estimated with different revisions of the same country-year GDP growth data.

Figure 1: GDP Growth ~ Democracy

Note: Figure 1 displays the relationship between GDP Growth and Democracy using the results from 21 different regression models. Plotted points represent parameter estimates, thick bars represent 90 percent confidence intervals, and thin bars represent 95 percent confidence intervals. Each point is labeled with the revision year used. The left side of the plot contains results from models estimated using the 2000-2004 data series, while the right side of the plot contains results from models estimated using the 1995-1999 data series. See paper for more details.
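
To make the re-estimation exercise concrete, here is a minimal sketch of fitting the same bivariate regression to several releases of the "same" growth data. It is not the model from the paper: the democracy scores, the growth values, and the three release years are all invented for illustration, and the helper name ols_slope is our own.

```python
from statistics import mean


def ols_slope(x, y):
    """Return the slope from a bivariate least-squares fit of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical democracy scores for five country-years (invented values).
democracy = [0.0, 1.0, 2.0, 3.0, 4.0]

# Hypothetical GDP growth for the same five country-years, as published in
# three different release ("observation") years. All values are invented.
growth_by_vintage = {
    2005: [1.0, 2.0, 3.0, 4.0, 5.0],
    2010: [2.0, 2.5, 3.0, 3.5, 4.0],
    2015: [3.0, 3.0, 3.0, 3.0, 3.0],
}

# Re-estimate the same relationship once per release of the data.
slopes = {
    vintage: ols_slope(democracy, growth)
    for vintage, growth in growth_by_vintage.items()
}

for vintage, slope in sorted(slopes.items()):
    print(f"release {vintage}: slope on democracy = {slope:.2f}")
```

In this toy setup the country-years and the democracy scores never change, yet the estimated slope differs from one release to the next. That is the kind of instability Figure 1 documents with real data.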

This finding anticipates our second question: Given the likelihood that GDP revisions are non-random, what accounts for ex post revisions? What does the “political economy” of revisions look like? We show using Kolmogorov-Smirnov tests (see Figure 2) and random forest models (see Figure 3) that the International Monetary Fund (IMF) influences the magnitude of revisions for GDP and GDP-related measures. That is not entirely surprising. Our suspicion that the IMF would have such an effect is a straightforward recognition of its well-publicized efforts to provide financial and human resources to the national statistical offices of the countries in which it works. What we have “uncovered” in this exercise is simply one consequence of the IMF doing precisely what it has publicly said it is doing. But this finding’s (retrospective) obviousness does not diminish its importance. Consider the empirical challenges that this presents. Political economists often ask if the IMF affects the way economies function, but the IMF’s independent effect on the way economies are measured substantially complicates our ability to know if it does. And it doesn’t just complicate our ability to know if the IMF’s policies affect the economy; it complicates our ability to know if anything correlated with IMF participation affects the economy. Many important things correlate with IMF participation, including, for example, democracy, a country’s relationship with the UN, and whether or not a country is an ally of the United States.

Figure 2: Distributions of GDP Growth Changes

Note: Figure 2 compares the distributions of GDP growth revisions for years with and without IMF programs. The y-axis indicates the height of the density function and the x-axis indicates the magnitude of GDP growth revisions in percentage points. The solid green line denotes country years with an IMF program, while the dashed black line denotes country years without a program. See paper for more details.
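
For readers unfamiliar with the Kolmogorov-Smirnov comparison behind Figure 2, the sketch below computes the two-sample test statistic, which is the largest vertical gap between two empirical cumulative distribution functions. It covers only the statistic, not the p-value, and the revision magnitudes are invented rather than taken from our data.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute difference between the empirical
    cumulative distribution functions of the two samples.
    """
    largest_gap = 0.0
    # The gap can only change at observed values, so check each one.
    for value in sorted(set(sample_a) | set(sample_b)):
        cdf_a = sum(1 for v in sample_a if v <= value) / len(sample_a)
        cdf_b = sum(1 for v in sample_b if v <= value) / len(sample_b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Invented revision magnitudes in percentage points (illustration only).
with_program = [0.5, 1.0, 1.5, 2.0]
without_program = [0.1, 0.2, 0.3, 0.4]

d = ks_statistic(with_program, without_program)
print(f"KS statistic: {d:.2f}")
```

A statistic near 0 means the two distributions of revisions look alike; a statistic near 1 means they barely overlap. The invented samples here are fully separated, which is far more extreme than anything in the real data.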

Figure 3: Predictors of GDP Growth Revisions

Note: Figure 3 presents the results from a random forest model that examines the predictors of GDP growth revisions. The vertical axis ranks variables according to their importance for predicting GDP Growth Changes. The horizontal axis displays estimates of permutation accuracy for each variable, calculated as the difference in mean squared error between a model that is fitted using the observed values for a measure and a model that is fitted using random (but realistic) values for the same measure. This measure is then scaled to represent the percentage increase in mean squared error caused by permuting the values of the variable. Positive values indicate that the variables increase the predictive performance of the model, while negative values indicate that the variables decrease the predictive performance of the model. See paper for more details.
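
The permutation measure described in the note can be sketched in a few lines. To keep the example short, a fixed stand-in prediction function replaces the random forest, and the rows, targets, and variable names (imf_program, corruption) are hypothetical. Only the importance calculation itself mirrors the description above.

```python
import random


def mse(model, rows, targets):
    """Mean squared error of a prediction function over a data set."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, column, seed=0):
    """Percentage increase in MSE after shuffling the values of one column."""
    baseline = mse(model, rows, targets)
    shuffled = [r[column] for r in rows]
    random.Random(seed).shuffle(shuffled)
    # Copy each row, swapping in the shuffled value for the chosen column.
    permuted_rows = [{**r, column: s} for r, s in zip(rows, shuffled)]
    return 100.0 * (mse(model, permuted_rows, targets) - baseline) / baseline


def stand_in_model(row):
    """Hypothetical stand-in for a fitted model; it ignores corruption."""
    return 1.0 + 2.0 * row["imf_program"]


# Invented country-year rows and revision magnitudes (illustration only).
rows = [
    {"imf_program": 0, "corruption": 0.2},
    {"imf_program": 1, "corruption": 0.9},
    {"imf_program": 0, "corruption": 0.4},
    {"imf_program": 1, "corruption": 0.1},
    {"imf_program": 0, "corruption": 0.7},
    {"imf_program": 1, "corruption": 0.5},
]
targets = [1.1, 2.9, 0.9, 3.1, 1.1, 2.9]

importance = {
    column: permutation_importance(stand_in_model, rows, targets, column)
    for column in ("imf_program", "corruption")
}
print(importance)
```

Because the stand-in model never looks at corruption, shuffling that column leaves the error unchanged and its importance is zero. Shuffling a variable the model relies on can only raise the error in this toy setup.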

Of course, politics likely affects the way the economy is measured in a variety of ways that have nothing to do with the IMF. Our random forest analysis suggests that democracy might also have an effect, for example, as might public sector corruption, and it is not hard to tell a plausible post hoc story for why that might be. But our aim is not to provide a comprehensive picture of the political economy of data production, but simply to show that it exists, and that it exists in a manner that should alert us to its importance. Taking seriously the political provenance of ostensibly apolitical data is an important (and, we believe, interesting) step towards refining the state of political economy knowledge.


[1] The raw data used in this paper are available at http://databank.worldbank.org/data/reports.aspx?source=WDI-Archives. To facilitate researcher use of this data, we will make it available in an R package, revisions. This package will contain long- and wide-format data sets.

Andrew Kerner is an Assistant Professor in the Political Science department at the University of Michigan, and a faculty associate at the Center For Political Studies.

Charles Crabtree is a PhD student in the Department of Political Science at the University of Michigan.

The Spread of Mass Surveillance, 1995 to Present

ICYMI (In Case You Missed It), the following work was presented at the 2017 Annual Meeting of the American Political Science Association (APSA). The presentation, titled “Big Data Innovation Transfer and Governance in Emerging High Technology States,” was part of the session “The Role of Business in Information Technology and Politics” on Friday, September 1, 2017.

Post developed by Nadiya Kostyuk and Muzammil M. Hussain

On August 24, 2017, India’s highest court ruled that citizens have a fundamental right to privacy. Such a ruling may serve to slow down the government’s deployment of the Aadhaar national ID program, a robust relational database that connects each of India’s 1.3+ billion citizens with their unique 12-digit identity and aims to centralize their physiological, demographic, and digital data shadows (minute pieces of data created when an individual sends an email, updates a social media profile, swipes a credit card, uses an ATM, etc.). While the government has presented the Aadhaar system as an improved channel to provide social security benefits for its nationals, India’s civil society organizations have protested it as a means of furthering government surveillance. India’s trajectory in ambitiously modernizing its high-tech toolkit for governance represents a rapidly spreading trend in the contemporary world system of 190+ nations.

Take China as another example. China has recently mobilized its government bureaucracies to establish the world’s first-ever, and largest, national Social Credit System, covering 1.4+ billion Chinese citizens. By 2020, China’s citizen management system will include each Chinese national’s financial history, online comments about government, and even traffic violations to rank their ‘trustworthiness.’ Like India’s, these unique ‘social credit’ ratings will reward and punish citizens for their behavioral allegiance with the regime’s goals by scientifically allowing the state to operationalize its vision of a “harmonious socialist society.”

Yet, the implementation of state-sponsored and ‘big data’-enabled surveillance systems to address the operational demands of governance is not limited just to the world’s largest democratic and authoritarian states. This summer, at the annual meetings of the International Communication Association (May 2017, San Diego) and the American Political Science Association (August 2017, San Francisco), the project on Big Data Innovation & Governance (BigDIG) presented findings from the first event-catalogued case-history analysis of 306 cases of mass surveillance systems that currently exist across 139 nation-states in the world system (Kostyuk, Chen, Das, Liang and Hussain, 2017). After identifying the ‘known universe’ of these population-wide data infrastructures that now shape the evolving relationships between citizens and state powers, our investigation paid particular attention to how state-sponsored mass surveillance systems have spread through the world-system, since 1995.

By closely investigating all known cases of state-backed cross-sector surveillance collaborations, our findings demonstrate that the deployment of mass surveillance systems by states has been globally increasing throughout the last twenty years (Figure 1). More importantly, from 2006-2010 to present, states have uniformly doubled their surveillance investments compared with the previous decade.

In addition to unpacking the funding and diffusion of mass surveillance systems, we are also addressing the following questions: Which stakeholders have most prominently expressed support for, benefited from, or opposed these systems, and why? What have been the comparative societal responses to the normalization of these systems for the purposes of population management in recent decades?

The observed cases in our study differ in scope and impact.

Why do stable democracies and autocracies operate similarly, while developing and emerging democracies operate differently? Access to and organization of material, financial, and technocratic resources may provide some context.

While nations worldwide have spent at least $27.1 billion USD (or $7 per individual) to surveil 4.138 billion individuals (i.e., 73 percent of the world population), stable autocracies are the highest per-capita spenders on mass surveillance. In total, authoritarian regimes have spent $10.967 billion USD to surveil 81 percent of their populations (0.1 billion individuals), even though this sub-set of states tends to have the lowest levels of high-technology capabilities. Stable autocracies have also invested 11-fold more than any other regime-type, by spending $110 USD per individual surveilled, followed second-highest by advanced democracies who have invested $8.909 billion USD in total ($11 USD per individual) covering 0.812 billion individuals (74 percent of their population). In contrast to high-spending dictatorships and democracies, developing and emerging democracies have invested $4.784 billion USD (or $1-2 per individual) for tracking 2.875 billion people (72 percent of their population).
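
The per-individual figures above follow from dividing total spending by the number of people surveilled. The short check below reproduces that arithmetic from the totals quoted in the paragraph. It assumes that the $10.967 billion attributed to authoritarian regimes is the stable-autocracy total, and the dictionary layout and labels are ours.

```python
# Regime type: (total spending in billions of USD,
#               individuals surveilled in billions).
# Figures are the ones quoted in the paragraph above.
spending = {
    "all nations": (27.1, 4.138),
    "stable autocracies": (10.967, 0.1),  # assumed to match "authoritarian regimes"
    "advanced democracies": (8.909, 0.812),
    "developing and emerging democracies": (4.784, 2.875),
}

# Billions of dollars divided by billions of people gives dollars per person.
per_individual = {
    regime: usd_billions / people_billions
    for regime, (usd_billions, people_billions) in spending.items()
}

for regime, dollars in per_individual.items():
    print(f"{regime}: about ${dollars:.2f} per individual surveilled")
```

Rounded to whole dollars, these ratios match the per-individual amounts reported in the text.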

It is possible that in a hyper-globalizing environment increasingly characterized by non-state economic (e.g., multi-national corporations) and political (e.g., transnational terror organizations) activity, nation-states have both learned from and mimicked each other’s investments in mass surveillance as an increasingly central activity in exercising power over their polities and jurisdictions. It is also likely that the technological revolution in digitally-enabled big data and cloud computing capabilities, as well as the ubiquitous digital wiring of global populations (through mobile telephony and digital communication), have technically enabled states to access and organize population-wide data on their citizens in ways not possible in previous eras. Regardless of the impetuses for increases in mass surveillance efforts, our research aims to provide empirical support to advance theory and guide policy on balancing security needs and privacy concerns at a time when many governments are ambitiously upgrading their governance systems with unbridled hi-tech capabilities.

 

Inequality is Always in the Room: Language and Power in Deliberative Democracy

ICYMI (In Case You Missed It), the following work was presented at the 2017 Annual Meeting of the American Political Science Association (APSA). The presentation, titled “Inequality is Always in the Room: Language and Power in Deliberative Democracy,” was part of the session “Is Deliberation War by Other Means?” on Thursday, August 31, 2017.

Posted by Catherine Allen-West


In a new paper, presented at the 2017 APSA meeting, Arthur Lupia, University of Michigan, and Anne Norton, University of Pennsylvania, explore the effectiveness of deliberative democracy by examining the foundational communicative acts that take place during deliberation.

Read the full paper here: http://www.mitpressjournals.org/doi/abs/10.1162/DAED_a_00447