Replication is hard...

...particularly when the data keeps changing. The ability to replicate results is essential to the scientific enterprise. One of the great benefits of experimental research is that, in principle, we can repeat the experiment and generate a fresh set of data. While this is impossible for many questions in social science, at a minimum one would hope that we could replicate our original results using the same dataset. As many students in Gov 2001 can tell you, however, social science often fails to clear even that low bar.

Of course, even this type of replication is impossible if someone else has changed the dataset since the original analysis was conducted. But that would never happen, right? Maybe not. In an interesting paper, Alexander Ljungqvist, Christopher Malloy, and Felicia Marston take a look at the I/B/E/S dataset of analyst stock recommendations "made" during the period from 1993 to 2002. Here is what they found:

Comparing two snapshots of the entire historical I/B/E/S database of research analyst stock recommendations, taken in 2002 and 2004 but each covering the same time period 1993-2002, we identify tens of thousands of changes which collectively call into question the principle of replicability of empirical research. The changes are of four types: 1) The non-random removal of 19,904 analyst names from historic recommendations (“anonymizations”); 2) the addition of 19,204 new records that were not previously part of the database; 3) the removal of 4,923 records that had been in the data; and 4) alterations to 10,698 historical recommendation levels. In total, we document 54,729 ex post changes to a database originally containing 280,463 observations.

Our main contribution is to document the characteristics and effects of these pervasive changes. The academic literature on analyst stock recommendations, using I/B/E/S data, is truly vast: As of December 12, 2006, Google Scholar identifies 565 articles and working papers using the keywords “I/B/E/S”, “analysts”, and “recommendations”. Given this keen academic interest, as well as the intense scrutiny that research analysts face in the marketplace and the growing popularity of trading strategies based on analyst output, changes to the historical I/B/E/S database are of obvious interest to academics and practitioners alike. We demonstrate that the changes have a significant effect on the distribution of recommendations, both overall and for individual stocks and individual brokerage firms. Equally important, they affect trading signal classifications, back-testing inferences, the track records of individual analysts, and models of analysts’ career outcomes in the years since the changes occurred. Regrettably, none of the changes can easily be “undone” by researchers, which makes replicating extant studies difficult. Our findings thus have potentially important ramifications for existing and future empirical studies of equity analysts.
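
Mechanically, detecting these four types of changes is a record-level diff between the two snapshots. Here is a minimal sketch in Python with pandas; the schema (record_id, analyst, rec_level) and the toy values are made up for illustration, and this is not the actual I/B/E/S layout or the authors' procedure:

    import pandas as pd

    # Two hypothetical snapshots of the same recommendations table, keyed by a
    # record identifier. Columns and values are illustrative only.
    snap_2002 = pd.DataFrame(
        {"analyst": ["A. Smith", "B. Jones", "C. Lee", "D. Kim"],
         "rec_level": [1, 2, 3, 2]},
        index=pd.Index([1, 2, 3, 4], name="record_id"))
    snap_2004 = pd.DataFrame(
        {"analyst": ["A. Smith", None, "D. Kim", "E. Park"],  # record 2 anonymized
         "rec_level": [1, 2, 1, 3]},                          # record 4 altered
        index=pd.Index([1, 2, 4, 5], name="record_id"))

    added = snap_2004.index.difference(snap_2002.index)    # new records
    removed = snap_2002.index.difference(snap_2004.index)  # deleted records
    common = snap_2002.index.intersection(snap_2004.index)

    # "Anonymizations": analyst name present in 2002, missing in 2004
    anonymized = common[(snap_2002.loc[common, "analyst"].notna()
                         & snap_2004.loc[common, "analyst"].isna()).to_numpy()]

    # Altered recommendation levels among surviving records
    altered = common[(snap_2002.loc[common, "rec_level"]
                      != snap_2004.loc[common, "rec_level"]).to_numpy()]

    print(list(added), list(removed), list(anonymized), list(altered))
    # [5] [3] [2] [4]

The catch, as the authors point out, is that a researcher needs both snapshots to run such a comparison in the first place; with only the later version in hand, the changes cannot easily be undone.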

Not surprisingly, they find that these changes typically make it appear as if analysts were (a) more cautious and (b) more accurate in their predictions. The clear implication from the paper is that analysts and their employers had a vested interest in selectively editing this particular dataset; while I doubt that anyone cares enough about most questions in political science to do something similar, it is an important cautionary tale. The rest of their paper, "Rewriting History," is available from SSRN. (Hat tip: Big Picture)

Posted by Mike Kellermann at 4:01 PM