“Difficulties in identifying problems have delayed statistics far more than difficulties in solving problems” – John W. Tukey

My Research Aim & Scope in one line: “United Statistical Science = LP Nonparametric Harmonic Analysis.”

Nonparametric Data Science: 

How can we develop a consistent and unified framework of data analysis (the foundation of data science) that would reveal the interconnectedness among different branches of statistics? This question is the driving force behind my research program. I have been developing one such candidate theory that will pave the way for a progressive unification of fundamental statistical learning tools. Our theory has given birth to a new and exciting discipline for 21st-century statistics, called “Nonparametric Data Science,” which does not yet have a large literature, and is slowly gaining ground.

“Assuming that a unified foundation is inevitable, what will it be? I think the general refusal in our field to strive for a unified perspective has been the single biggest impediment to its advancement” – Jim Berger (2000).

We focus on one important field of statistics at a time, with the goal of simplifying, unifying, and generalizing it using our “Nonparametric Data Science” theory and tools. Under this new framework, a significant number of statistical problems have been tackled to date, including: statistical spectral analysis of graphs (Mukhopadhyay, 2017d), large-scale mode identification for discovery science (Mukhopadhyay, 2017a), unified multiple testing (Mukhopadhyay, 2016), nonparametric copula dependence modeling (Parzen and Mukhopadhyay, 2013b), non-linear time series modeling (Mukhopadhyay and Parzen, 2017; Mukhopadhyay and Nandi, 2017), high-dimensional data modeling (Mukhopadhyay and Wang, 2017c), and nonparametric distributed learning (Bruce et al., 2016; Mukhopadhyay, 2017b). These findings strongly indicate that the theory of “United Statistical Algorithms” may be just around the corner.

Throughout, my goal has been to judiciously balance the discipline of statistics (developing a unified general theory of statistics) and the profession of statistics (applied data analysis, interdisciplinary collaboration, and consultation) to become a “whole” 21st-century statistician. I am also designing an educational program called “Nonparametric Data Science,” a series of well-connected training modules with the goal of broadly preparing students and applied researchers [PDF], which will be disseminated freely online.

Recent Projects:

  1. Mixed Data Science [Details]
  2. Foundations of Graph Data Science [Details]
  3. Bayes via Goodness-of-fit [Details]
  4. United Multiple Testing [Details]
  5. Statistics, Big Data, and Parallelism: Towards A Unified Framework [Details]