Post developed by Catherine Allen-West, Charles Crabtree and Andrew Kerner

ICYMI (In Case You Missed It), the following work was presented at the 2017 Annual Meeting of the American Political Science Association (APSA). The presentation, titled “The IMF and the Political Economy of GDP Data Production,” was part of the session “Economic Growth and Contraction: Causes, Consequences, and Measurement” on Sunday, September 3, 2017.

Political economists often theorize about relationships between politics and macroeconomics in the developing world; specifically, which political or social structures promote economic growth, or wealth, or economic openness, and conversely, how those economic outcomes affect politics. Answering these questions often requires some reference to macroeconomic statistics. However, recent work has questioned these data’s accuracy and objectivity. An under-explored aspect of these data’s limitations is their instability over time. Macroeconomic data are frequently revised ex post, or after the fact, so one could ask the same question of (ostensibly) the same data and get different answers depending on when the question was asked.

We set out to explore the political economy of data production by examining a newly available dataset of ex post revisions to World Development Indicators (WDI) data.[1] Ex post revisions occur when newly available information changes national statistical offices’ beliefs about the nature of the economy. Those revisions extend into the past, effectively rewriting history and, in the process, providing a reasonable proxy for the inaccuracy of the initial reports. These revisions affect a wide swath of data, but we focus on Gross Domestic Product (GDP) and GDP-derived measures, like GDP per capita and GDP growth. GDP revisions are common: most GDP data available for download at the WDI are different now than they were at the time of their initial release. Normally these changes are subtle; other times they are substantial enough to condemn prior data releases as misleading.

We use these revisions to answer two related questions. First, how sensitive are political-economy relationships to GDP revisions? Should researchers worry about revisions-driven instability in the state of political-economic knowledge? We show that they should. To illustrate, we subject a simple, bivariate statistical relationship between democracy and growth to re-estimation using alternative versions of the “same” data. The democracy-growth relationship has been a topic of sufficient interest in economics and political science that instability in this relationship should give us reason to pause. Seen in this light, our estimates are worrisome. As we show in Figure 1 below, our estimates are unstable across different “observation years” and, further, they are unstable in ways that suggest that initial estimates were biased. If revisions simply reduced random “noise” in the data, we would expect only a diminution of standard errors as more heavily revised data are introduced. Instead, the estimated coefficients for Democracy change substantially across models estimated with different revisions of the same country-year GDP growth data.

Figure 1: GDP Growth ~ Democracy

Note: Figure 1 displays the relationship between GDP Growth and Democracy using the results from 21 different regression models. Plotted points represent parameter estimates, thick bars represent 90 percent confidence intervals, and thin bars represent 95 percent confidence intervals. Each point is labeled with the revision year used. The left side of the plot contains results from models estimated using the 2000-2004 data series, while the right side of the plot contains results from models estimated using the 1995-1999 data series. See paper for more details.
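The re-estimation exercise behind Figure 1 can be sketched in a few lines. The example below is a minimal illustration, not the paper's code: the data are synthetic, the names `democracy`, `vintages`, and `ols_slope` are hypothetical, and the confidence intervals use a normal approximation. It fits the same bivariate regression of growth on democracy once per "observation year" and reports how the coefficient and its intervals move.

```python
import random


def ols_slope(x, y):
    """Return (slope, standard error) for the bivariate regression y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, used for the classical OLS standard error.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5
    return slope, se


# Synthetic stand-in data (assumption): one democracy score per country-year,
# and several vintages of GDP growth for the same country-years.
rng = random.Random(0)
democracy = [rng.uniform(-10, 10) for _ in range(200)]
vintages = {2005: [2.0 + 0.05 * d + rng.gauss(0, 2) for d in democracy]}
for year in (2010, 2015):
    previous = vintages[max(vintages)]
    # Each later vintage revises the previous one by a small random amount.
    vintages[year] = [g + rng.gauss(0, 0.5) for g in previous]

# Re-estimate the same model on every vintage of the "same" data.
estimates = {year: ols_slope(democracy, growth)
             for year, growth in sorted(vintages.items())}

for year, (slope, se) in estimates.items():
    ci90 = (slope - 1.645 * se, slope + 1.645 * se)
    ci95 = (slope - 1.96 * se, slope + 1.96 * se)
    print(year, round(slope, 3), [round(v, 3) for v in ci90],
          [round(v, 3) for v in ci95])
```

With real data, each entry in `vintages` would hold the growth series as it appeared in one archived WDI release. If revisions were pure noise reduction, the slopes would stay put while the intervals narrowed.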

This finding anticipates our second question: Given the likelihood that GDP revisions are non-random, what accounts for ex post revisions? What does the “political economy” of revisions look like? We show using Kolmogorov-Smirnov tests (see Figure 2) and random forest models (see Figure 3) that the International Monetary Fund (IMF) influences the magnitude of revisions for GDP and GDP-related measures. That is not entirely surprising. Our suspicion that the IMF would have such an effect is a straightforward recognition of its well-publicized efforts to provide financial and human resources to the national statistical offices of the countries in which it works. What we have “uncovered” in this exercise is simply one consequence of the IMF doing precisely what it has publicly said it is doing. But this finding’s (retrospective) obviousness does not diminish its importance. Consider the empirical challenges that this presents. Political economists often ask if the IMF affects the way economies function, but the IMF’s independent effect on the way economies are measured substantially complicates our ability to know if it does. And it doesn’t just complicate our ability to know if the IMF’s policies affect the economy, it complicates our ability to know if anything correlated with IMF participation affects the economy. Many important things correlate with IMF participation, including, for example, democracy, a country’s relationship with the UN, and whether or not a country is an ally of the United States.

Figure 2: Distributions of GDP Growth Changes

Note: Figure 2 compares the distributions of GDP growth revisions for years with and without IMF programs. The y-axis indicates the height of the density function and the x-axis indicates the magnitude of GDP growth revisions in percentage points. The solid green line denotes country years with an IMF program, while the dashed black line denotes country years without a program. See paper for more details.
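For readers unfamiliar with the two-sample Kolmogorov-Smirnov test, the statistic is the largest vertical gap between the two empirical cumulative distribution functions. The sketch below computes that statistic on synthetic revisions. The function name `ks_statistic` and both samples are illustrative assumptions, not the paper's data or code. In practice a statistics library (for example `scipy.stats.ks_2samp`) would also supply the p-value.

```python
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: the maximum gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        # Step both empirical CDFs past every observation equal to v (handles ties).
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


# Synthetic revisions in percentage points (assumption): the two groups share
# a mean but differ in spread, so their distributions differ.
rng = random.Random(1)
with_program = [rng.gauss(0, 2.0) for _ in range(300)]
without_program = [rng.gauss(0, 1.0) for _ in range(300)]

d_stat = ks_statistic(with_program, without_program)
print("KS statistic:", round(d_stat, 3))
```

The statistic ranges from 0 (identical empirical distributions) to 1 (no overlap). Because it compares whole distributions, it can detect differences in spread or shape even when the two group means are the same.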

Figure 3: Predictors of GDP Growth Revisions

Note: Figure 3 presents the results from a random forest model that examines the predictors of GDP growth revisions. The vertical axis ranks variables according to their importance for predicting GDP Growth Changes. The horizontal axis displays estimates of permutation accuracy for each variable, calculated as the difference in mean squared error between a model that is fitted using the observed values for a measure and a model that is fitted using random (but realistic) values for the same measure. This measure is then scaled to represent the percentage increase in mean squared error caused by permuting the values of the variable. Positive values indicate that the variables increase the predictive performance of the model, while negative values indicate that the variables decrease the predictive performance of the model. See paper for more details.
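The permutation measure described in the note can be illustrated without a full random forest. In the sketch below, a fixed linear function stands in for the fitted model. That is a simplifying assumption, since the paper uses a random forest, and the feature names and data are hypothetical. For each variable, the code shuffles its values across observations, recomputes mean squared error, and reports the percentage increase over the baseline.

```python
import random


def mse(predict, rows, y):
    """Mean squared error of predict() over the rows."""
    return sum((predict(r) - yi) ** 2 for r, yi in zip(rows, y)) / len(y)


def permutation_importance(predict, rows, y, n_features, rng):
    """Percentage increase in MSE after shuffling each feature in turn."""
    baseline = mse(predict, rows, y)
    scores = []
    for k in range(n_features):
        column = [r[k] for r in rows]
        rng.shuffle(column)  # break the link between feature k and the outcome
        permuted = [r[:k] + (v,) + r[k + 1:] for r, v in zip(rows, column)]
        scores.append(100.0 * (mse(predict, permuted, y) - baseline) / baseline)
    return scores


# Synthetic data (assumption): the outcome depends strongly on feature_a,
# weakly on feature_b, and not at all on feature_c.
rng = random.Random(2)
names = ["feature_a", "feature_b", "feature_c"]
rows = [tuple(rng.gauss(0, 1) for _ in names) for _ in range(500)]
y = [3.0 * a + 0.5 * b + rng.gauss(0, 1) for a, b, _ in rows]


def predict(row):
    # Stand-in for a fitted model; a real analysis would call the forest here.
    return 3.0 * row[0] + 0.5 * row[1]


importance = permutation_importance(predict, rows, y, len(names), rng)
for name, score in sorted(zip(names, importance), key=lambda p: -p[1]):
    print(name, round(score, 1))
```

A variable the model never uses scores exactly zero here, because shuffling it leaves every prediction unchanged. With a fitted forest, weak or irrelevant variables can score slightly below zero by chance, which is why negative values appear in plots like Figure 3.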

Of course, politics likely affects the way the economy is measured in a variety of ways that have nothing to do with the IMF. Our random forest analysis suggests that democracy might also have an effect, for example, as might public sector corruption, and it is not hard to tell a plausible post hoc story for why that might be. But our aim is not to provide a comprehensive picture of the political economy of data production, but simply to show that it exists, and that it exists in a manner that should alert us to its importance. Taking seriously the political provenance of ostensibly apolitical data is an important (and, we believe, interesting) step towards refining the state of political economy knowledge.


[1] The raw data used in this paper are available at http://databank.worldbank.org/data/reports.aspx?source=WDI-Archives. To facilitate researcher use of this data, we will make it available in an R package, revisions. This package will contain long- and wide-format data sets.

Andrew Kerner is an Assistant Professor in the Political Science department at the University of Michigan, and a faculty associate at the Center For Political Studies.

Charles Crabtree is a PhD student in the Department of Political Science at the University of Michigan.