Calculating error bounds on metrics derived from very large data sets has long been problematic, largely because classical resampling methods become computationally expensive at scale. In more traditional statistics one can put a confidence interval or error bound on most metrics (e.g., mean), parameters (e.g., slope in a regression), or classifications (e.g., confusion matrix and the Kappa statistic).
In many machine learning applications, an error bound can be critical. Casson Stallings makes this point well with the example of a company developing a method for acquiring customers.
Which statement gives a CEO more appropriate information on how to proceed: the answer without the error bound, or the answer with it?
If this topic interests you, you can read the full step-by-step guide on using the Bag of Little Bootstraps methodology to compute error bounds on machine learning tasks, and access the whole project on Domino.
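To make the idea concrete, here is a minimal sketch of the Bag of Little Bootstraps for a confidence interval on a sample mean. The function name, subset-size exponent, and default parameters are illustrative choices, not the guide's actual implementation: the data is split into small random subsets, each subset is resampled up to the full sample size via multinomial counts, and the per-subset interval endpoints are averaged.

```python
import numpy as np

def blb_confidence_interval(data, statistic=np.mean, n_subsets=5,
                            n_resamples=50, alpha=0.05, seed=0):
    """Bag of Little Bootstraps: approximate a (1 - alpha) confidence
    interval for `statistic` over `data`. Parameter defaults are
    illustrative, not prescriptive."""
    rng = np.random.default_rng(seed)
    n = len(data)
    b = int(n ** 0.6)  # "little" subset size, much smaller than n
    lowers, uppers = [], []
    for _ in range(n_subsets):
        subset = rng.choice(data, size=b, replace=False)
        estimates = []
        for _ in range(n_resamples):
            # Draw multinomial counts so each resample behaves like a
            # full-size (n-point) bootstrap sample of the subset.
            counts = rng.multinomial(n, np.full(b, 1.0 / b))
            resample = np.repeat(subset, counts)
            estimates.append(statistic(resample))
        lo, hi = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
        lowers.append(lo)
        uppers.append(hi)
    # BLB aggregation step: average the per-subset interval endpoints.
    return np.mean(lowers), np.mean(uppers)

data = np.random.default_rng(42).normal(loc=10.0, scale=2.0, size=10_000)
lo, hi = blb_confidence_interval(data)
print(f"95% CI for the mean: [{lo:.3f}, {hi:.3f}]")
```

Because each resample is represented by counts over only b distinct points, the expensive statistic is computed on small subsets, which is what makes the method tractable on very large data sets.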