We all know that calculating error bounds on metrics derived from very large data sets has been problematic for a number of reasons. In more traditional statistics, one can put a confidence interval or error bound on most metrics (e.g., a mean), parameters (e.g., the slope in a regression), or classifications (e.g., a confusion matrix and the Kappa statistic).

For many machine learning applications, an error bound could be very important. Casson Stallings makes a great point, using an example of a company developing a method of acquiring customers.

*Which statement gives a CEO more appropriate information on how to proceed, the answer without the error bound, or with the error bound?*

If this topic interests you, you can read the full step-by-step guide on how to use the Bag of Little Bootstraps methodology to compute error bounds on machine learning tasks, and access the whole project on Domino.
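For orientation, here is a minimal sketch of the Bag of Little Bootstraps idea applied to the simplest metric mentioned above, a mean. It is not the code from the Domino project. The function name `blb_mean_interval`, its parameters (`n_subsets`, `n_resamples`, `gamma`, `alpha`), and the synthetic data are all assumptions made for illustration. The sketch draws a few small subsets, resamples each one up to the full data size, takes a percentile interval per subset, and averages the interval endpoints.

```python
import random
import statistics
from collections import Counter


def blb_mean_interval(data, n_subsets=5, n_resamples=50, gamma=0.7,
                      alpha=0.05, seed=0):
    """Approximate a (1 - alpha) percentile interval for the mean.

    Hypothetical helper for illustration; all parameter names are assumptions.
    """
    rng = random.Random(seed)
    n = len(data)
    b = max(2, int(n ** gamma))  # subset size, much smaller than n
    lowers, uppers = [], []
    for _ in range(n_subsets):
        # Draw one small subset without replacement.
        subset = rng.sample(data, b)
        estimates = []
        for _ in range(n_resamples):
            # Resample n points from the b subset points with replacement,
            # stored as counts so only b distinct values are touched.
            # (A larger-scale version would draw multinomial counts directly.)
            counts = Counter(rng.choices(range(b), k=n))
            weighted_sum = sum(subset[i] * c for i, c in counts.items())
            estimates.append(weighted_sum / n)
        estimates.sort()
        lo_idx = int((alpha / 2) * (n_resamples - 1))
        hi_idx = int((1 - alpha / 2) * (n_resamples - 1))
        lowers.append(estimates[lo_idx])
        uppers.append(estimates[hi_idx])
    # Average the per-subset interval endpoints.
    return statistics.mean(lowers), statistics.mean(uppers)


# Synthetic data, assumed purely for the example.
data_rng = random.Random(1)
data = [data_rng.gauss(10.0, 2.0) for _ in range(5000)]
low, high = blb_mean_interval(data)
print(f"mean = {statistics.mean(data):.3f}, interval = ({low:.3f}, {high:.3f})")
```

The same structure should carry over to other metrics by replacing the weighted mean with a weighted version of the estimator of interest, for example a model's accuracy computed with per-row weights.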

