21st Century Statistics Educational Challenge.

“There is a need to train students to use deep, broad, and creative statistical thinking instead of just training them in algorithms,” write Kettenring et al. (2015) in “Challenges and Opportunities for Statistics in the Next 25 Years,” The American Statistician, 69(2), 86-90 (special issue on the ASA’s 175th anniversary).

How should we prepare a data science workforce of trained next-generation statisticians? What should students learn in order to glimpse the frontiers of statistical theory and methods that lead to big data science?

I strongly believe that statisticians in the 21st century, if they take advantage of their opportunities, can look forward to a very bright future for the discipline and profession of statistics by solving an important problem: teaching thousands of aspiring statistical data scientists. Developing a comprehensive training curriculum that covers the fundamental methods of statistical learning is challenging and requires a new approach.

For a full report on my views on “Statistics Educational Challenge in the 21st Century” see [PDF]

An Approach: Integrating Research with Educational Activities.

The current trend, based on the “Las Vegas Approach” or what I call the “Cut & Paste” approach, in which a course is nothing but a collection of disconnected topics with cookbook recipes, is destined to fail. We have to invent strategies (possibly based on new tools and concepts) that differ from traditional courses by teaching methods for simple data in ways that extend to complex big data (similar to the goal of teaching finite-dimensional math in notation that extends to infinite-dimensional Hilbert space). I strongly feel this initiative is extremely important; otherwise statistics will be viewed as the “least important part of data science.” There is an urgent need to develop a framework, which does NOT exist today, that systematically synthesizes and applies the past half-century of key methodological progress and that is also applicable to modern “Big Data.”

I am currently developing such a multidisciplinary PhD-level graduate course, entitled “Nonparametric Data Science.” Rather than giving specialized training in a ‘narrow’ field of statistics (a hedgehog), I seek to provide a ‘comprehensive connected view’ of traditional statistical methods (the power of being a fox) in a way that extends to modern complex datasets. I envision that a successful implementation of this plan will strengthen the statistical core for data science among our PhD students. Traditional statistical courses start by assuming data-generating models. In this course, we will develop new concepts and tools that systematically design data models by looking at the data (see Breiman, 2001 for more discussion). The attitude of this course will be ‘Data’ first, then ‘Science’ (Data + Science, not Science + Data), demonstrating how data modeling can help us discover ‘new’ science rather than validate ‘old’ science.

Courses Taught.

I have taught both undergraduate and graduate courses at Temple University since the fall semester of 2013.


STAT 8115: Nonparametric Methods (Spring 2018, Fall 2016).  (newly redesigned course)

STAT 9190: Nonparametric Data Science (Spring 2016 and 2017, Special Topics Course).  (newly developed course)

STAT 8001: Probability Theory I  [Google Discussion Group]

STAT 8002: Probability Theory II  [Piazza Online Forum]


STAT 2501: Quantitative Foundations for Data Science (Fall and Spring, 2017 and 2018).  (newly developed course)

STAT 2103: Statistical Business Analytics (old name: Business Statistics)  [MyPearsonLab]