I was recently asked (by a theoretical computer scientist): "What is the role of Statistics in the development of Theoretical Data Science?" I think this is a fundamental question. If you have any opinion on this, please feel free to share it with us. Here are my 2 cents:

Theory of [Efficient] Computing: a branch of Theoretical Computer Science that deals with how quickly a given algorithm can solve a problem. The critical task is to analyze algorithms carefully, based on their performance characteristics, in order to make them computationally efficient.

Theory of Unified Algorithms: an emerging branch of Theoretical Statistics that deals with how efficiently one can represent a large class of diverse algorithms using a single unified semantics. The critical task is to put together different "mini-algorithms" into a coherent master algorithm.

For the overall development of Data Science, we need both ANALYSIS + SYNTHESIS. However, it is also important to bear in mind the distinction between the two.


The Backdrop. Bayesians and Frequentists have long been ambivalent toward each other. The concept of the "Prior" remains at the center of this 250-year-old tug-of-war: frequentists view the prior as a weakness that can cloud the final inference, whereas Bayesians view it as a strength for incorporating expert knowledge into the data analysis. So the question naturally arises: how can we develop a consolidated Bayes-frequentist data analysis workflow that enjoys the best of both worlds?

Twin Goals. To develop a "defendable and defensible" Bayesian learning model, we have to go beyond blindly 'turning the crank' on a "go-as-you-like" [approximate guess] prior. A lackluster attitude toward prior modeling can lead to disastrous inference, with impacts ranging from clinical drug development to presidential election forecasting. The real questions are: How can we uncover the blind spots of a conventional wisdom-based prior? How can we develop a science of prior model-building that combines both data and science [DS-prior] in a testable manner – a double-yolk Bayesian egg? Unfortunately, these questions lie outside the scope of the business-as-usual Bayesian modus operandi and require new ideas.

Recent Progress. Some recent attempts in this direction can be found in http://tiny.cc/BayesGOF, which lays out a new mechanics of data modeling that effectively consolidates Bayesian and frequentist, parametric and nonparametric, subjective and objective, and quantile and information-theoretic philosophies. At a practical level, however, the main attractions of our "Bayes via goodness of fit" framework lie in (i) its ability to quantify and protect against prior-data conflict using exploratory graphical diagnostics, and (ii) its theoretical simplicity, which lends itself to analytic closed-form solutions and avoids computationally intensive techniques such as MCMC or variational methods.

Associated R package: Mukhopadhyay, S. and Fletcher, D. (2018). BayesGOF: Bayesian Modeling via Goodness of Fit, [Link] [vignette].

