The “Science” and “Management” of Data Analysis

Hierarchy and branches of Statistical Science

The phrases “Science” and “Management” of data analysis were introduced by Manny Parzen (2001) in his discussion of Leo Breiman’s paper “Statistical Modeling: The Two Cultures,” where he pointed out:

Management seeks profit, practical answers (predictions) useful for decision making in the short run. Science seeks truth, fundamental knowledge about nature which provides understanding and control in the long run.

Management = Algorithm, prediction and inference, is undoubtedly the most useful and “sexy” part of Statistics. Over the past two decades, there have been tremendous advances on this front, leading to a growing body of literature and excellent textbooks like Hastie, Tibshirani, and Friedman (2009) and, more recently, Efron and Hastie (2016).

Nevertheless, we surely all agree that algorithms do not arise in a vacuum, and our job as statistical scientists should be more than just finding another “gut” algorithm. It has long been observed that elegant statistical learning methods can often be derived from something more fundamental. This forces us to think about the guiding principles for designing (wholesale) algorithms. The “Science” of data analysis = Algorithm discovery engine (Algorithm of Algorithms). Finding such a consistent framework of Statistical Science (from which one might be able to systematically derive a wide range of working algorithms) promises not to be trivial.

Above all, I strongly believe the time has come to switch our focus from “management” to the heart of the matter: how can we create an inclusive and coherent framework of data analysis (to accelerate the innovation of new versatile algorithms), one with “a place for everything, and everything in its place,” that encodes the fundamental laws of numbers? In this (difficult yet rewarding) journey, we have to constantly remind ourselves of the enlightening piece of advice from Murray Gell-Mann (2005):

We have to get rid of the idea that careful study of a problem in some NARROW range of issues is the only kind of work to be taken seriously, while INTEGRATIVE thinking is relegated to cocktail party conversation.

