.

**Bayesian Model for COVID-19 Spread Prediction**

At present time, there are different methods, approaches, data sets for for modeling COVID-19 spread [1, 2, 3, 4, 5, 6]. For the predictive analytics of COVID-19 spread, we used a logistic curve model. Such model is very popular nowadays. To estimate model parameters, we used Bayesian regression [7, 8, 9]. This approach allows us to receive a posterior distribution of model parameters using conditional likelihood and prior distribution. In the Bayesian inference, we can use informative prior distributions which can be set up by an expert. So, the result can be considered as a compromise between historical data and expert opinion. It is important in the cases when we have a small number of historical data. Probabilistic approach makes it possible to receive the probability density function for the target variable. Logistic curve model with Bayesian regression approach can be written as follows:

where Date0 is a start day for observations in the historical data set, it is measured in weeks. The data for the analysis were taken from here [2]. α parameter denotes maximum cases of coronavirus, β parameter is an empirical coefficient which denotes the rate of coronavirus spread. For solving Bayesian models, numerical Monte-Carlo methods are used. Gibbs and Hamiltonian sampling are the popular methods of finding posterior distributions for the parameters of probabilistic mode [7, 8, 9]. Bayesian inference makes it possible to obtain probability density functions for model parameters and estimate the uncertainty that is important in risk assessment analytics. In Bayesian regression approach, we can take into account expert opinions via information prior distribution. For Bayesian inference calculations, we used pystan package for Stan platform for statistical modeling [9]. The next figure shows the box plots for β parameters of coronavirus spread model for different countries:

Next figures show the predictions for coronavirus spread cases using current historical data:

In practical analytics, it is important to find the maximum of coronavirus cases per day, this point means the estimated half time of coronavirus spread in the region under investigation. New historical data will correct the distributions for model parameters and forecasting results. The ’Bayesian Model for COVID-19 spread Prediction’ package can be loaded here [10] for free use.

**COVID-19 Impact on the Stock Market**

Coronavirus outbreak has a huge impact on the stock market. It is very important , e.g. for forming stable portfolios, to understand how different crises impact stock prices. We consider the impact of coronavirus crisis on stocks and compare it to the crisis of 2008 and market downturn of 2018. For this, we can use the regression approach using OLS regression and Bayesian regression. Bayesian inference makes it possible to obtain probability density functions for coefficients of the factors under investigation and estimate the uncertainty that is important in the risk assessment analytics. In Bayesian regression approach, we can analyze extreme target variable values using non gaussian distributions with fat tails. We took the following time periods for each of crises - crisis 2008: [2008-01-01,2009-01-31], down turn 2018: [2018-10- 01,2019-01-03], coronavirus: [2020-02-18,2020-03-25]. For each of above mentioned crises, we created a regression variable which is equal to 1 in the crisis time period and 0 in other cases. Next figure shows the time series for S&P500 composite index:

As a target variable, we consider the daily price return. Knowing the daily price return, changes in crises periods, one can estimate the ability of investors to understand trends and recalculate portfolios. These results were received using Bayesian inference. For Bayesian inference calculations, we used Python pystan package for Stan platform for statistical modeling [9]. The next figure shows the box plots of impact weights of each crisis on S&P500 composite index:

The next figure shows the tickers with positive price return in coronavirus crisis:

The next figure shows the weights for different crises for arbitrarily chosen stocks:

We calculated the distributions for crises weights using Bayesian inference. The next figure shows the box plots for crisis weights for different stocks:

**Conclusion**

The logistic curve model can be used with Bayesian regression for predictive analytics of coron- avirus spread. Such a model can be effective when the exponential growth of number of coron- avirus confirmed cases takes place. In practical analytics, it is important to find the maximum of coronavirus cases per day, this point means the estimated half time of coronavirus spread in the region under investigation. New historical data will correct the distributions for model parameters and forecasting results. ’Bayesian Model for COVID-19 spread Prediction’ package can be loaded here for free use.

The obtained results show that different crises with different reasons have different impact on the same stocks. It is important to analyze their impact separately. Bayesian inference makes it possible to analyze the uncertainty of crisis impacts. The uncertainty of crisis impact weights can be measured as a standard deviation for weight probability density functions. The uncertainty of coronavirus crisis is larger comparing to other crises that can be caused by shorter analysis time. Knowing the uncertainty allows making risk assessment for portfolios and other financial and business processes. In Bayesian regression, we can receive a quantitative measure for the uncertainty that can be a very useful information for experts in model selection and stacking. An expert can also set up informative prior distributions for stacking regression coefficients of models, taking into account the domain knowledge information.

**References**

[1] IHME COVID, Christopher JL Murray, et al. Forecasting covid-19 impact on hospital bed-days, icu-days, ventilator-days and deaths by us state in the next 4 months. https://www.medrxiv.org/content/10.1101/2020.03.27.20043752v1

[2] COVID19 Global Forecasting (Week 2). Kaggle.Com. https://www.kaggle.com/c/covid19-global-forecasting-week-2

[3] CSSE COVID-19 Dataset. GitHub.Com. https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_1...

[4] Coronavirus (Covid-19) Data in the United States. GitHub.Com. https://github.com/nytimes/covid-19-data

[5] COVID-19 reports. https://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/c...

[6] COVID-19 Kaggle community contributions. Kaggle.Com. https://www.kaggle.com/covid-19-contributions

[7] John Kruschke. Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan. Aca- demic Press, 2014.

[8] Andrew Gelman, John B Carlin, Hal S Stern, David B Dunson, Aki Vehtari, and Donald B Rubin. Bayesian data analysis. Chapman and Hall/CRC, 2013.

[9] Bob Carpenter, Andrew Gelman, Matthew D Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. Stan: A probabilistic programming language. Journal of statistical software, 76(1), 2017.

[10] Bayesian Model for COVID-19 Spread Prediction. GitHub.Com. https://github.com/pavlyshenko/covid19

- Juniper adds Mist AIOps to its 128 Technology-based SD-WAN
- 10 microservices patterns all architects should know
- IBM extends Call for Code for Racial Justice program
- citizen development
- How to manage third-party risk in the supply chain
- Gartner predicts data storytelling will dominate BI by 2025
- AWS Data Exchange and the third-party cloud data marketplace
- Overcome common IoT edge computing architecture issues

Posted 1 March 2021

© 2021 TechTarget, Inc. Powered by

Badges | Report an Issue | Privacy Policy | Terms of Service

**Most Popular Content on DSC**

To not miss this type of content in the future, subscribe to our newsletter.

- Book: Applied Stochastic Processes
- Long-range Correlations in Time Series: Modeling, Testing, Case Study
- How to Automatically Determine the Number of Clusters in your Data
- New Machine Learning Cheat Sheet | Old one
- Confidence Intervals Without Pain - With Resampling
- Advanced Machine Learning with Basic Excel
- New Perspectives on Statistical Distributions and Deep Learning
- Fascinating New Results in the Theory of Randomness
- Fast Combinatorial Feature Selection

**Other popular resources**

- Comprehensive Repository of Data Science and ML Resources
- Statistical Concepts Explained in Simple English
- Machine Learning Concepts Explained in One Picture
- 100 Data Science Interview Questions and Answers
- Cheat Sheets | Curated Articles | Search | Jobs | Courses
- Post a Blog | Forum Questions | Books | Salaries | News

**Archives:** 2008-2014 |
2015-2016 |
2017-2019 |
Book 1 |
Book 2 |
More

**Most popular articles**

- Free Book and Resources for DSC Members
- New Perspectives on Statistical Distributions and Deep Learning
- Time series, Growth Modeling and Data Science Wizardy
- Statistical Concepts Explained in Simple English
- Machine Learning Concepts Explained in One Picture
- Comprehensive Repository of Data Science and ML Resources
- Advanced Machine Learning with Basic Excel
- Difference between ML, Data Science, AI, Deep Learning, and Statistics
- Selected Business Analytics, Data Science and ML articles
- How to Automatically Determine the Number of Clusters in your Data
- Fascinating New Results in the Theory of Randomness
- Hire a Data Scientist | Search DSC | Find a Job
- Post a Blog | Forum Questions

## You need to be a member of Data Science Central to add comments!

Join Data Science Central