We all know that correlations range from -1 to +1. What about correlations between random variables that take only non-negative values, for instance with a Poisson, exponential or Gamma joint distribution? You would think that these multivariate random variables have a hard time having a very negative correlation. Here we focus on a specific example that has practical applications.
Let's assume that we are dealing with a bivariate distribution (X, Y), where the marginals X and Y each have an exponential distribution. What is the most negative correlation that we could have between X and Y? The answer is not -1: it is about -0.645, and the exact value is 1 - (Pi^2)/6. Read this article for a proof, and for more general results. In particular, if you want to generate an even more negative correlation, try with Gamma distributions.
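To get a feel for the bound, here is a minimal simulation sketch. It assumes one standard way of making two exponential variables as negatively dependent as possible (not necessarily the construction used in the linked proof): draw a single uniform U on (0, 1) and set X = -log(U), Y = -log(1 - U). Both X and Y are then Exponential(1), and Y is a decreasing function of X. The function names, the sample size and the seed are illustrative choices, not something taken from the article.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def antithetic_exponential_pair(n, seed=0):
    """Draw n pairs (X, Y) with Exponential(1) marginals from one shared uniform.

    Assumption for illustration: X = -log(U), Y = -log(1 - U).
    """
    rng = random.Random(seed)
    xs, ys = [], []
    while len(xs) < n:
        u = rng.random()               # uniform on [0, 1)
        if u == 0.0:                   # skip the endpoint to avoid log(0)
            continue
        xs.append(-math.log(u))        # Exponential(1)
        ys.append(-math.log(1.0 - u))  # Exponential(1), small when X is large
    return xs, ys


xs, ys = antithetic_exponential_pair(200_000)
print("simulated correlation:", round(pearson(xs, ys), 3))
print("1 - pi^2/6           :", round(1 - math.pi ** 2 / 6, 3))
```

With a large sample, the simulated value should land close to 1 - (Pi^2)/6 and stay well away from -1, since no pair of positive, right-skewed exponential variables can lie exactly on a decreasing straight line.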
Application
This model has been used for weather prediction. The variables X and Y are, respectively, the duration and intensity of storm cells. They are typically modeled as independent variables, while in reality, the more intense the precipitation, the shorter the duration (thus a negative correlation). So this problem helps develop a more accurate weather prediction system. You can read the detailed paper here.
Hi Vincent,
Can you please clarify your very first statement? I do not really know that "correlations range from -1 to +1". I know that it is true for correlations measured by Pearson's correlation coefficient. I also know that Pearson's correlation coefficient only measures linear relations between variables, and this very strong practical limitation results in endless attempts to introduce other correlation measures, but none is nearly as popular, I believe mostly because their interpretation is much less straightforward.
Many thanks,
Michael
Hi Michael,
Yes, I was referring to the classic coefficient of correlation that you study in high school. I agree, it has many drawbacks, and I am myself an advocate of alternative measures of correlation, see for instance this article.
Best,
Vincent
Thank you Vincent, it is a useful clarification. Unfortunately, quite a few people tend to use the tool they know best rather than the one appropriate for a particular situation. Pearson's r will return a value close to 0 for a perfectly functional relation such as y=sin(x) sampled over many periods, but that does not mean that r is not a good tool :)
Kind regards,
Michael
© 2018 Data Science Central™