# Beware of Retail Algorithm Building Culture, Yet Another Example

Beware of the “button effect” (as argued here): the growing culture of developing RETAIL statistical algorithms and software can prevent data scientists from “looking at the data,” leading to useless answers. Simple examples illustrate the importance of Nonparametric Exploratory Statistical Practice in an age of easy-to-use retail statistical software: http://s.hbr.org/1qVt3Fj.

# BIG data, BIG opportunity, BIG Tent

Question: How can we create a BIG TENT to increase the visibility and IMPACT of our profession?

There are two possibilities (applied interdisciplinary work and core research). Both efforts are important and should be balanced:

(a) Applying retail algorithms while working in multidisciplinary teams (as described in PDF):

BIG data brings enormous interdisciplinary opportunities for statisticians. Statisticians can make a living by applying (traditional) statistical tools plus problem-specific bells and whistles.

(b) Developing wholesale algorithms that can be translated into curriculum.

Along with retail, domain-specific problem solving (and retail paper publications), academic statisticians need to develop wholesale algorithms with multidisciplinary utility (like AIC, Bootstrap, kNN, RKHS, SVM, Spline, RF, Lasso, etc.) that other disciplines can routinely use for their data-driven (exploratory, not confirmatory) research. I fear we might lose our unique identity, and the spirit of our discipline as the science of learning from data, if we focus too much on solving ‘isolated’ problems while busy making a living for ourselves using traditional tools plus some twists and turns. If we do, we will produce skilled biologists or engineers, not statisticians. We need to find a balance between these two broad approaches, which share the same goal: to advance the frontiers of statistics.

# Algorithms: Retail and Wholesale

Broadly speaking, there are two kinds of algorithms (based on their utility):

• Retail: Solving real scientific problems one at a time for clients/collaborators.
• Wholesale: Theory and algorithms applicable simultaneously to many clients and problems. An example [PDF].
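
To make the distinction concrete, here is a minimal sketch of what a “wholesale” tool looks like in code, using the bootstrap (one of the wholesale algorithms named above). The function name `bootstrap_se`, its parameters, and the `measurements` data are hypothetical and chosen only for illustration; this is a plain nonparametric bootstrap, not a method taken from the linked papers. The point is that one routine serves any statistic a client brings, rather than being rebuilt for each problem.

```python
import random
import statistics
from typing import Callable, Sequence


def bootstrap_se(
    data: Sequence[float],
    statistic: Callable[[Sequence[float]], float],
    n_resamples: int = 2000,
    seed: int = 0,
) -> float:
    """Estimate the standard error of `statistic` by resampling `data`.

    Hypothetical helper: the same code works for any statistic
    (mean, median, ...), which is what makes it "wholesale".
    """
    rng = random.Random(seed)  # fixed seed so results are reproducible
    n = len(data)
    # Recompute the statistic on resamples drawn with replacement.
    replicates = [
        statistic(rng.choices(data, k=n)) for _ in range(n_resamples)
    ]
    # The spread of the replicates approximates the standard error.
    return statistics.stdev(replicates)


# Made-up measurements, for illustration only.
measurements = [12.1, 9.8, 11.4, 10.3, 13.7, 8.9, 10.8, 12.5, 9.4, 11.9]

# Two different "clients", one algorithm.
print("SE of the mean:  ", bootstrap_se(measurements, statistics.mean))
print("SE of the median:", bootstrap_se(measurements, statistics.median))
```

A retail solution would instead hard-code a formula for one statistic in one study; the wholesale version trades that specificity for reuse across problems.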

This paper [PDF] argues that academic statisticians should aim to develop “wholesale” algorithms.

Our research on United Statistical Algorithms is motivated by the following question:

“How to develop a Systematic Data Modeling Strategy? How to design Flexible and Reusable algorithms based on General Theory that can be adapted to solve specific Practical Problems?”