Saturday, February 9, 2013

Lies, Damn Lies and Statistics ... in the age of Big Data

The interesting article linked below touches on an aspect of Big Data that has been bothering me. It argues, in essence, that huge data sets lend themselves to malfeasance by researchers: correlations are easy to find in a large data set, and the vast majority of them are spurious. I liked the following analogy:

Just like bankers who own a free option — where they make the profits and transfer losses to others – researchers have the ability to pick whatever statistics confirm their beliefs (or show good results) … and then ditch the rest. 
Big-data researchers have the option to stop doing their research once they have the right result. In options language: The researcher gets the “upside” and truth gets the “downside.” It makes him antifragile, that is, capable of benefiting from complexity and uncertainty — and at the expense of others.
As a fan of Big Data and of Googlizing facilities and infrastructure, I view this as a cautionary tale.
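
The point about spurious correlations is easy to see with made-up data. What follows is a minimal Python sketch, not something taken from the article. It assumes purely random, unrelated numbers, and the function names, the 50 observations, the 1,000 candidate variables and the 0.279 cutoff (roughly the conventional 5% significance level for 50 observations) are all illustrative assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_spurious(n_obs=50, n_vars=1000, threshold=0.279, seed=0):
    """Correlate one random target against many unrelated random variables.

    Returns (hits, strongest): how many variables exceed the cutoff in
    absolute correlation, and the largest absolute correlation seen.
    All parameters are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    hits = 0
    strongest = 0.0
    for _ in range(n_vars):
        # Each candidate is independent noise, so any correlation is spurious.
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        r = abs(pearson(target, candidate))
        if r > threshold:
            hits += 1
        strongest = max(strongest, r)
    return hits, strongest


if __name__ == "__main__":
    hits, strongest = count_spurious()
    print(f"'significant' correlations found in pure noise: {hits} of 1000")
    print(f"strongest absolute correlation: {strongest:.2f}")
```

With these settings, roughly 5% of the unrelated variables should clear the cutoff by chance alone. A researcher who reports only those and ditches the rest is exercising exactly the free option described in the quote.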

Welcome to the Collaborative Revolution!

James L. Salmon, Esq.
Collaborative Construction
300 Pike Street
Cincinnati, Ohio 45202

Summary of Services and James L. Salmon's CV

Office 513-721-5672
Fax 513-562-4388
Cell 512-630-4446

Collaborative Construction Website
