Post developed by Katie Brown and Josh Pasek.

Photo credit: ThinkStock

Have you noticed how the products you look at online seem to follow you from site to site, and how the coupons you receive in the mail sometimes seem a little too targeted? This happens because a set of companies is gathering information about Americans and merging it into vast marketing databases. In addition to creating awkwardly personal advertisements, these data might be useful for researchers who want to know about the kinds of people who are and are not responding to public opinion surveys.

But before marketing data are incorporated into social science analyses, it is important to know how accurate the information actually is. Indeed, there are many concerns about consumer data. They could be out of date, incomplete, linked to the wrong person, or simply false for a variety of reasons. If we don’t know when marketing data are accurate, it is going to be difficult to figure out how these data can be used.

This is where the work of Josh Pasek, Center for Political Studies (CPS) Faculty Associate and Assistant Professor of Communication, comes in. Pasek, along with S. Mo Jang, Curtiss L. Cobb, J. Michael Dennis, and Charles DiSogra, has a forthcoming paper in Public Opinion Quarterly about the utility of marketing data. Working with GfK Custom Research, the researchers selected 25,000 random addresses, and about 10% of those joined the study. The marketing data available on these individuals were then matched against data collected as part of the study.

Interestingly, many variables showed large discrepancies between the two sources. Incomes mismatched by more than $10,000 for 43% of participants, while education level differed in at least two measures for 25%. Even the number of people living at the address differed by two or more in 35% of cases. Pasek and colleagues also investigated missing data with three different analyses. Ultimately, they found that the amount of information missing from the consumer data is vast.
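To make this kind of comparison concrete, here is a minimal sketch of how a mismatch rate and a missing-data rate could be computed for a single variable such as income. It is an illustration only, not the procedure used in the paper: the function name `discrepancy_summary`, the way missing values are handled, and all of the numbers are made up for this example.

```python
from typing import Optional, Sequence, Tuple


def discrepancy_summary(
    survey: Sequence[Optional[float]],
    marketing: Sequence[Optional[float]],
    threshold: float,
) -> Tuple[float, float]:
    """Return (missing_share, mismatch_share) for one variable.

    missing_share:  fraction of all records with no marketing value.
    mismatch_share: among records where both sources have a value,
                    the fraction that differ by more than `threshold`.
    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(survey) != len(marketing):
        raise ValueError("both sources must cover the same records")

    total = len(survey)
    missing = sum(1 for m in marketing if m is None)

    # Only compare records where both sources report a value.
    compared = [
        (s, m) for s, m in zip(survey, marketing)
        if s is not None and m is not None
    ]
    mismatched = sum(1 for s, m in compared if abs(s - m) > threshold)

    missing_share = missing / total if total else 0.0
    mismatch_share = mismatched / len(compared) if compared else 0.0
    return missing_share, mismatch_share


# Invented incomes for five respondents (None = no marketing record).
survey_income = [52000, 38000, 75000, 61000, 44000]
marketing_income = [50000, 60000, None, 62000, 20000]

missing_share, mismatch_share = discrepancy_summary(
    survey_income, marketing_income, threshold=10000
)
print(f"missing: {missing_share:.0%}, mismatched: {mismatch_share:.0%}")
```

Separating the two rates matters because a database can be wrong in two different ways: it can hold no value at all for a person, or it can hold a value that disagrees with what the person reports.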

But at the same time, the consumer data performed better than chance in predicting actual data for all variables. This may make them useful for marketing purposes, but Pasek cautions that social scientific applications could be problematic. As Pasek says, “The bottom line is that these data are not consistently accurate. Although they may be great for targeting people who are more likely to buy a particular brand of shoes, our results suggest that marketing databases don’t have the precision for many research purposes.”