Deflating Big Data Hubris

Written Sep 24, 2014

The highs and lows of Google Flu Trends, “once a poster child for the power of big-data analysis,” serve as a case study for David Lazer, Ryan Kennedy, Gary King, et al., in their Science Magazine* article “The Parable of Google Flu: Traps in Big Data Analysis.”

In the age of “big data hubris,” here are two takeaways from their analysis:

  • Don’t neglect lessons already learned; combine new methods with established wisdom:

“Big data hubris” is the often implicit assumption that big data are a substitute for, rather than a supplement to, traditional data collection and analysis.

  • Aim for transparency and replicability when publishing your models:

“Science is a cumulative endeavor, and to stand on the shoulders of giants requires that scientists be able to continually assess work on which they are building […] Accumulation of knowledge requires fuel in the form of data.”

*References

The article is available here:
“The Parable of Google Flu: Traps in Big Data Analysis” (Science, 14 March 2014)

The New York Times’ Bits Blog post on Google Flu Trends:
“Google Flu Trends: The Limits of Big Data”