
The use of formal statistical methods to analyse quantitative data in data science has increased considerably over the last few years. One such approach, **Bayesian Decision Theory (BDT)**, also known as Bayesian Hypothesis Testing and Bayesian inference, is a fundamental statistical approach that quantifies the tradeoffs between various decisions using the probability distributions and costs that accompany those decisions. In pattern recognition it is used to design classifiers under the assumption that the problem is posed in probabilistic terms and that all of the relevant probability values are known. Generally, we don't have such perfect information, but it is a good place to start when studying machine learning, statistical inference, and detection theory in signal processing. BDT also has many applications in science, engineering, and medicine.

From the perspective of BDT, any kind of probability distribution, such as the distribution for tomorrow's weather, represents a prior distribution. That is, it represents what we expect today about tomorrow's weather. This contrasts with frequentist inference, the classical probability interpretation, where conclusions about an experiment are drawn from a set of repetitions of that experiment, each producing statistically independent results. For a frequentist, a probability function would be a simple distribution function with no special meaning.

In BDT a decision can be viewed as a hypothesis about which distribution the observations of the random variable *Y* come from. For instance, in image analysis you may want to decide whether a picture shows a cat or a dog, in medicine whether a heartbeat is normal or irregular, and in radar whether a target is on the map or not. We assume two possible hypotheses, $H_0$ (the null hypothesis) and $H_1$ (the alternate hypothesis), corresponding to two possible probability distributions $P_0$ and $P_1$ on the observation space $\Gamma$. We write this problem as $H_0$ versus $H_1$. A decision rule $\delta$ for $H_0$ versus $H_1$ is any partition of the observation set $\Gamma$ into sets $\Gamma_0$ and $\Gamma_1$. We think of the decision rule as such:

$$\delta(y) = \begin{cases} 1 & \text{if } y \in \Gamma_1 \\ 0 & \text{if } y \in \Gamma_0 \end{cases}$$

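We would like to optimize how we choose $\Gamma_1$, so we assign costs to our decisions, which are some non-negative numbers: $C_{ij}$ is the cost incurred by choosing hypothesis $H_i$ when hypothesis $H_j$ is true. The decision rule can alternatively be written as a likelihood ratio test, which computes the likelihood ratio $L(y)$ for the observed value of *Y* and then makes its decision by comparing this ratio to the threshold $\tau$:

$$\delta(y) = \begin{cases} 1 & \text{if } L(y) \geq \tau \\ 0 & \text{if } L(y) < \tau \end{cases}$$

where

$$L(y) = \frac{p_1(y)}{p_0(y)}$$

and

$$\tau = \frac{\pi_0 \, (C_{10} - C_{00})}{\pi_1 \, (C_{01} - C_{11})}$$

Here $p_j$ is the density of $P_j$ and $\pi_j$ is the prior probability that $H_j$ is true.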
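The likelihood ratio test above can be sketched in a few lines of Python. This is only an illustration under assumed values: the Gaussian densities, the means 0 and 2, the unit variance, and the helper names (`gaussian_pdf`, `bayes_threshold`, `likelihood_ratio_test`) are hypothetical and do not come from the original post. The nested list `costs[i][j]` stands for the cost of choosing $H_i$ when $H_j$ is true.

```python
import math

def gaussian_pdf(y, mean, std):
    """Density of a normal distribution N(mean, std^2) evaluated at y."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def bayes_threshold(prior0, costs):
    """tau = pi_0 (C10 - C00) / (pi_1 (C01 - C11)), with pi_1 = 1 - pi_0."""
    prior1 = 1.0 - prior0
    return prior0 * (costs[1][0] - costs[0][0]) / (prior1 * (costs[0][1] - costs[1][1]))

def likelihood_ratio_test(y, p0, p1, tau):
    """Return 1 (choose H1) if p1(y) / p0(y) >= tau, else 0 (choose H0).

    The comparison is written as p1(y) >= tau * p0(y) to avoid dividing by zero.
    """
    return 1 if p1(y) >= tau * p0(y) else 0

# Assumed setup for illustration: H0 ~ N(0, 1), H1 ~ N(2, 1).
def p0(y):
    return gaussian_pdf(y, 0.0, 1.0)

def p1(y):
    return gaussian_pdf(y, 2.0, 1.0)

# Uniform costs: 0 for a correct decision, 1 for an error.
uniform_costs = [[0.0, 1.0], [1.0, 0.0]]
tau = bayes_threshold(0.5, uniform_costs)  # equal priors

for y in (0.5, 1.5):
    print(f"y = {y}: choose H{likelihood_ratio_test(y, p0, p1, tau)}")
```

With these assumed values the observation 0.5 is closer to the mean under $H_0$ and 1.5 is closer to the mean under $H_1$, so the rule picks $H_0$ and $H_1$ respectively.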

We then define the conditional risk for each hypothesis as the expected (average) cost incurred by the decision rule when that hypothesis is true:

$$R_j(\delta) = C_{1j} \, P_j(\Gamma_1) + C_{0j} \, P_j(\Gamma_0), \quad j = 0, 1$$

$R_j(\delta)$ is the cost of choosing $H_1$ when $H_j$ is true multiplied by the probability of this decision, plus the cost of choosing $H_0$ when $H_j$ is true multiplied by the probability of doing so. Next we assign a prior probability $\pi_0$ that $H_0$ is true, unconditioned on the observation, and a prior probability $\pi_1 = 1 - \pi_0$ that $H_1$ is true. Given the risks and prior probabilities, we can then define the Bayes risk, which is the overall average cost of the decision rule:

$$r(\delta) = \pi_0 R_0(\delta) + \pi_1 R_1(\delta)$$
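As a small arithmetic check of the conditional risk and Bayes risk just described, here is a sketch in Python. The function names and the numbers (a 10% chance of deciding for the alternate hypothesis when the null is true, an 80% chance when the alternate is true) are made up for illustration and are not from the original post. `costs[i][j]` is the cost of choosing hypothesis i when hypothesis j is true, and `prob_choose_h1[j]` is the probability that the rule decides for the alternate hypothesis when hypothesis j is true.

```python
def conditional_risk(j, costs, prob_choose_h1):
    """Expected cost of the decision rule when hypothesis j is true.

    Cost of choosing H1 times the probability of choosing H1,
    plus cost of choosing H0 times the probability of choosing H0.
    """
    p_h1 = prob_choose_h1[j]
    p_h0 = 1.0 - p_h1
    return costs[1][j] * p_h1 + costs[0][j] * p_h0

def bayes_risk(prior0, costs, prob_choose_h1):
    """Prior-weighted average of the two conditional risks."""
    prior1 = 1.0 - prior0
    return (prior0 * conditional_risk(0, costs, prob_choose_h1)
            + prior1 * conditional_risk(1, costs, prob_choose_h1))

# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
uniform_costs = [[0.0, 1.0], [1.0, 0.0]]
prob_choose_h1 = [0.1, 0.8]

risk0 = conditional_risk(0, uniform_costs, prob_choose_h1)  # about 0.1
risk1 = conditional_risk(1, uniform_costs, prob_choose_h1)  # about 0.2
overall = bayes_risk(0.5, uniform_costs, prob_choose_h1)    # about 0.15
print(risk0, risk1, overall)
```

With uniform costs the Bayes risk is simply the overall probability of making a wrong decision, which is why the result here is the average of the two error rates.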

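The optimum decision rule for $H_0$ versus $H_1$ is one that minimizes the Bayes risk over all decision rules. Such a rule is called the Bayes rule. A simple illustrative example is the decision boundary when $P_0$ and $P_1$ are Gaussian, with uniform costs and equal priors. In that case the threshold on the likelihood ratio equals 1, so the Bayes rule chooses whichever hypothesis gives the observation the higher likelihood.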
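The Gaussian case can be explored numerically. The sketch below assumes, beyond what the text states, that both Gaussians share the same standard deviation and that the mean under the alternate hypothesis is the larger one. In that situation the likelihood ratio test reduces to comparing the observation with a single boundary value. The specific means, the grid, and the function names are hypothetical choices for illustration.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def decision_boundary(mu0, mu1, sigma, tau=1.0):
    """Boundary y* such that the likelihood ratio is >= tau exactly when y >= y*.

    Assumes mu1 > mu0 and a shared standard deviation sigma.
    """
    return 0.5 * (mu0 + mu1) + sigma ** 2 * math.log(tau) / (mu1 - mu0)

def error_probability(t, mu0, mu1, sigma, prior0=0.5):
    """Bayes risk under uniform costs for the rule 'choose H1 if y >= t'."""
    false_alarm = 1.0 - normal_cdf((t - mu0) / sigma)  # choose H1 although H0 is true
    miss = normal_cdf((t - mu1) / sigma)               # choose H0 although H1 is true
    return prior0 * false_alarm + (1.0 - prior0) * miss

# Assumed example: H0 ~ N(0, 1), H1 ~ N(2, 1), uniform costs, equal priors.
mu0, mu1, sigma = 0.0, 2.0, 1.0
boundary = decision_boundary(mu0, mu1, sigma)

# Brute-force check: scan candidate boundaries and keep the one with lowest risk.
grid = [i / 100.0 for i in range(-200, 401)]
best = min(grid, key=lambda t: error_probability(t, mu0, mu1, sigma))
min_risk = error_probability(boundary, mu0, mu1, sigma)

print(f"analytic boundary: {boundary}")
print(f"best boundary on the grid: {best}")
print(f"Bayes risk at the boundary: {min_risk:.4f}")
```

Under these assumptions the boundary falls at the midpoint between the two means, and the grid search agrees with the analytic value. Unequal priors or costs change the threshold and shift the boundary toward the less likely or less costly hypothesis.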

Posted 3 May 2021
