Saturday, February 14, 2004

The Perils of Data Mining

Over at Crooked Timber, John Quiggin has a very interesting post up about the pitfalls of data mining. This is well worth reading, if only as something to keep in mind on those occasions when some person or other threatens to "run the regressions" to "prove" some point.
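
To make the worry concrete, here is a minimal sketch of one common data-mining pitfall: test enough candidate variables against an outcome and some will look "significant" by chance alone. This is my own illustration, not something taken from Quiggin's post. The function names, sample sizes, and the 5% cutoff are all assumptions chosen for the example, and the significance check uses the Fisher z approximation rather than an exact t-test.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def count_spurious_hits(n_obs=50, n_predictors=200, seed=0):
    """Count pure-noise predictors that pass a nominal 5% two-sided test.

    Hypothetical setup: the outcome and every predictor are independent
    standard normal draws, so any "significant" correlation is spurious.
    """
    rng = random.Random(seed)
    y = [rng.gauss(0, 1) for _ in range(n_obs)]
    hits = 0
    for _ in range(n_predictors):
        x = [rng.gauss(0, 1) for _ in range(n_obs)]
        r = pearson_r(x, y)
        # Fisher z approximation: atanh(r) * sqrt(n - 3) is roughly N(0, 1)
        # when the true correlation is zero.
        z = math.atanh(r) * math.sqrt(n_obs - 3)
        if abs(z) > 1.96:
            hits += 1
    return hits


if __name__ == "__main__":
    total = 200
    hits = count_spurious_hits(n_predictors=total)
    print(f"{hits} of {total} noise predictors look 'significant' at 5%")
```

Roughly one in twenty of the noise variables should clear the bar on average, which is exactly what a 5% significance level means. Anyone who searches through enough specifications and reports only the winners has "proved" nothing.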

There is a lot more to statistics than many people who suppose themselves sophisticated users of its techniques* realize, and economists (and econometricians in particular) are at the forefront of wrestling with the problems that the field faces. In fact, I'd say that anyone looking for a thorough understanding of statistical techniques and their limitations is better off reading a book like William H. Greene's Econometric Analysis in combination with Fumio Hayashi's Econometrics than the sort of pap usually ladled out in the biological and social sciences.

*Especially those who work in fields like psychology, with practitioners of psychometrics being the worst offenders.