Given $n$ observations $x_1, \ldots, x_n$, the generalized mean (also called *power mean*) is defined as

$$M_p(x_1, \ldots, x_n) = \left(\frac{1}{n}\sum_{i=1}^{n} x_i^p\right)^{1/p}.$$

The case $p = 1$ corresponds to the traditional arithmetic mean, while the limit $p \to 0$ yields the geometric mean, and $p = -1$ yields the harmonic mean. See here for details. This metric is favored by statisticians. It is a particular case of the quasi-arithmetic mean.
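The three special cases can be checked numerically. The following is a minimal sketch, assuming positive observations and the standard power mean formula; the function name `power_mean` is chosen here for illustration.

```python
import math

def power_mean(xs, p):
    """Power mean of positive numbers xs with exponent p."""
    n = len(xs)
    if p == 0:
        # The case p = 0 is a limit: the geometric mean, computed in log space.
        return math.exp(sum(math.log(x) for x in xs) / n)
    return (sum(x ** p for x in xs) / n) ** (1.0 / p)

xs = [1.0, 2.0, 4.0]
arithmetic = power_mean(xs, 1)   # (1 + 2 + 4) / 3
geometric = power_mean(xs, 0)    # cube root of 1 * 2 * 4
harmonic = power_mean(xs, -1)    # 3 / (1/1 + 1/2 + 1/4)
print(arithmetic, geometric, harmonic)
```

Treating $p = 0$ as a separate branch avoids dividing by zero in the exponent $1/p$.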

Here I introduce another kind of mean called the *exponential mean*, also based on a parameter $p$, that may appeal to data scientists and machine learning professionals. It is also a special case of the quasi-arithmetic mean. Though the concept is basic, there is very little literature about it, if any. It is related to LogSumExp and the log semiring. It is defined as follows:

$$m_p(x_1, \ldots, x_n) = \log_p\left(\frac{1}{n}\sum_{i=1}^{n} p^{x_i}\right).$$

Here the logarithm is in base $p$, with $p$ positive and $p \neq 1$. As $p$ tends to 0, $m_p$ tends to the minimum of the observations. As $p$ tends to 1, it yields the classic arithmetic mean, and as $p$ tends to infinity, it yields the maximum of the observations.
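The limiting behavior can be explored with a short script. This is a minimal sketch, assuming the exponential mean is $\log_p\left(\frac{1}{n}\sum_i p^{x_i}\right)$; the function name `exp_mean` and the sample data are made up for illustration. Writing $t = \ln p$, the mean equals $(\mathrm{LogSumExp}(t x_1, \ldots, t x_n) - \ln n)/t$, so the usual max-shift trick keeps the computation from overflowing.

```python
import math

def exp_mean(xs, p):
    """Exponential mean, computed as (LogSumExp(t * x) - ln n) / t with t = ln p."""
    if p <= 0:
        raise ValueError("p must be positive")
    n = len(xs)
    t = math.log(p)
    if abs(t) < 1e-12:
        # p = 1 is a limiting case: the arithmetic mean.
        return sum(xs) / n
    z = [t * x for x in xs]
    z_max = max(z)
    # Shift by the largest exponent before exponentiating (LogSumExp trick).
    log_mean = z_max + math.log(sum(math.exp(v - z_max) for v in z) / n)
    return log_mean / t

xs = [0.07, 0.5, 0.89]  # illustrative data, not the article's test set
near_min = exp_mean(xs, 1e-300)
at_one = exp_mean(xs, 1.0)
near_max = exp_mean(xs, 1e300)
print(near_min, at_one, near_max)

# Large-magnitude inputs do not overflow thanks to the shift.
wide = exp_mean([-1000.0, 1000.0], 10.0)
print(wide)
```

Even with $p$ as extreme as $10^{-300}$ or $10^{300}$, the result is only close to the minimum or maximum, not equal to it: the gap shrinks like $\ln n / |\ln p|$.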

**Advantages of the exponential mean**

One advantage of the exponential mean is that it always exists, even if the observations take on negative values. This is not the case in general for the power mean if $p$ is not an integer. Also, the exponential mean is stable under translation, unlike the power mean. That is, if you add a constant $a$ to each observation, then the exponential mean is shifted by the same quantity $a$. In short, $m_p(x_1 + a, \ldots, x_n + a) = a + m_p(x_1, \ldots, x_n)$. Conversely, the power mean is stable under multiplication by a positive constant $c$, that is, $M_p(c x_1, \ldots, c x_n) = c\,M_p(x_1, \ldots, x_n)$, while the exponential mean is not.
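These properties are easy to verify numerically. The sketch below assumes the usual formulas $M_p = \left(\frac{1}{n}\sum_i x_i^p\right)^{1/p}$ and $m_p = \log_p\left(\frac{1}{n}\sum_i p^{x_i}\right)$; the function names, the data, and the constants `a`, `c`, and `p` are arbitrary choices made for illustration.

```python
import math

def power_mean(xs, p):
    return (sum(x ** p for x in xs) / len(xs)) ** (1.0 / p)

def exp_mean(xs, p):
    return math.log(sum(p ** x for x in xs) / len(xs), p)

xs = [0.2, 0.5, 0.9]
a, c, p = 3.0, 2.0, 2.5

# Translation: the exponential mean shifts by exactly a, the power mean does not.
shift_exp = exp_mean([x + a for x in xs], p) - exp_mean(xs, p)
shift_pow = power_mean([x + a for x in xs], p) - power_mean(xs, p)

# Scaling: the power mean is multiplied by exactly c, the exponential mean is not.
scale_pow = power_mean([c * x for x in xs], p) / power_mean(xs, p)
scale_exp = exp_mean([c * x for x in xs], p) / exp_mean(xs, p)

print(shift_exp, shift_pow)
print(scale_pow, scale_exp)

# Negative observations are fine for the exponential mean.
with_negatives = exp_mean([-1.5, 0.2, 0.9], p)
print(with_negatives)

# In Python, a negative float raised to a non-integer power is complex,
# which is why the power mean leaves the real line in that case.
print((-1.5) ** p)
```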

Finally, the central limit theorem applies to both the power and exponential means as the number $n$ of observations grows.

**Illustration on a test data set**

I tested both means (exponential and power means) for various values of $p$ ranging between 0 and 2. See the chart above, where the X-axis represents the parameter $p$, and the Y-axis represents the mean. The test data set consists of 10 numbers randomly chosen between 0 and 1, with an average value of 0.53. Note that if $p = 1$, then $m_p = M_p = 0.53$ is the standard arithmetic mean.

The blue curve in the chart above is very well approximated by a logarithm function, except when $p$ is very close to zero or $p$ is extremely large. The red curve is well approximated by a second-degree polynomial. Convergence to the maximum of the observations (equal to 0.89 here), as $p$ tends to infinity, occurs much faster with the power mean than with the exponential mean. Note that $\min(x_1, \ldots, x_n) = 0.07$ in this example, and the exponential mean will start approaching that value only when $p$ is extremely close to zero.
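The difference in convergence speed can be reproduced on synthetic data. This is a minimal sketch that does not use the original test set: it draws 10 pseudo-random numbers in $[0, 1)$ with a fixed seed, and the helper names `power_mean` and `exp_mean` are chosen here for illustration. It prints, for increasing $p$, how far each mean is from the maximum.

```python
import math
import random

def power_mean(xs, p):
    # Assumes p > 0 and nonnegative observations.
    return (sum(x ** p for x in xs) / len(xs)) ** (1.0 / p)

def exp_mean(xs, p):
    t = math.log(p)
    if t == 0.0:
        return sum(xs) / len(xs)  # p = 1: arithmetic mean
    z = [t * x for x in xs]
    z_max = max(z)
    return (z_max + math.log(sum(math.exp(v - z_max) for v in z) / len(xs))) / t

rng = random.Random(0)                 # fixed seed, synthetic data
xs = [rng.random() for _ in range(10)]
x_max = max(xs)

for p in (1, 10, 100, 1000):
    gap_power = x_max - power_mean(xs, p)
    gap_exp = x_max - exp_mean(xs, p)
    print(f"p={p:5d}  gap(power)={gap_power:.4f}  gap(exp)={gap_exp:.4f}")
```

The gap for the power mean shrinks roughly like $1/p$, while the gap for the exponential mean shrinks roughly like $1/\ln p$, which is consistent with the slower convergence described above.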

**Important inequality**

The following inequality, valid for the power mean $M_p$ and resulting from the convexity of some underlying function, also applies to the exponential mean $m_p$:

- If $p < q$, then $m_p \le m_q$, and $m_p = m_q$ if and only if $x_1 = \cdots = x_n$.
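A numerical spot check of this inequality is straightforward, although it is of course not a proof. The sketch below assumes $m_p = \log_p\left(\frac{1}{n}\sum_i p^{x_i}\right)$, with the arithmetic mean at $p = 1$; the data and the grid of $p$ values are arbitrary.

```python
import math

def exp_mean(xs, p):
    if p == 1:
        return sum(xs) / len(xs)  # limiting case
    return math.log(sum(p ** x for x in xs) / len(xs), p)

xs = [-0.5, 0.07, 0.3, 0.89]
ps = [0.01, 0.1, 0.5, 1.0, 2.0, 10.0, 100.0]

means = [exp_mean(xs, p) for p in ps]
is_nondecreasing = all(lo <= hi for lo, hi in zip(means, means[1:]))
print(means, is_nondecreasing)

# Equality case: identical observations give the same mean for every p.
constant = [0.4] * 4
constant_means = [exp_mean(constant, p) for p in ps]
spread = max(constant_means) - min(constant_means)
print(spread)
```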

Proving this inequality is equivalent (see here) to proving that

The derivative of $m_p$ with respect to $p$ is well approximated by a power function, positive everywhere, which suggests that the inequality indeed holds. Let us denote this derivative as $m'_p$. We have (see here):
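As a sketch of where such an expression can come from, assume the exponential mean is $m_p = \log_p\left(\frac{1}{n}\sum_{i} p^{x_i}\right) = \frac{1}{\ln p}\ln\left(\frac{1}{n}\sum_{i} p^{x_i}\right)$ (the definition is restated here as an assumption). Differentiating the quotient with respect to $p$, for $p \neq 1$, gives

$$m'_p = \frac{1}{p \ln p}\left(\frac{\sum_{i=1}^{n} x_i\, p^{x_i}}{\sum_{i=1}^{n} p^{x_i}} - m_p\right).$$

Near $p = 1$, writing $t = \ln p$, $\mu$ for the arithmetic mean and $\sigma^2 = \frac{1}{n}\sum_i (x_i - \mu)^2$, the expansion $\ln\left(\frac{1}{n}\sum_i e^{t x_i}\right) = \mu t + \frac{\sigma^2}{2} t^2 + O(t^3)$ yields

$$m_p = \mu + \frac{\sigma^2}{2}\ln p + O\left((\ln p)^2\right), \qquad \text{so that} \qquad m'_1 = \frac{\sigma^2}{2}.$$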

It is interesting to note that $m_1$ is the arithmetic mean and $m'_1$ is half the empirical variance of $x_1, \ldots, x_n$. It would be interesting to see how higher-order derivatives of $m_p$, evaluated at $p = 1$, are related to higher empirical moments of $x_1, \ldots, x_n$.
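The relationship between the derivative at $p = 1$ and the variance can be checked with a finite difference. This is a minimal sketch, assuming $m_p = \log_p\left(\frac{1}{n}\sum_i p^{x_i}\right)$ and taking the empirical variance with a $1/n$ normalization; the data and the step size `h` are arbitrary.

```python
import math

def exp_mean(xs, p):
    return math.log(sum(p ** x for x in xs) / len(xs), p)

xs = [0.1, 0.4, 0.5, 0.9]
h = 1e-4

# Central finite difference approximating the derivative of m_p at p = 1.
derivative_at_one = (exp_mean(xs, 1 + h) - exp_mean(xs, 1 - h)) / (2 * h)

mu = sum(xs) / len(xs)
variance = sum((x - mu) ** 2 for x in xs) / len(xs)  # 1/n normalization

print(derivative_at_one, variance / 2)
```

The two printed values should agree to several decimal places; the central difference is used because $m_p$ itself has a removable singularity at exactly $p = 1$.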

**Doubly exponential mean**

Here we mention two generalizations of the exponential mean. The first one, defined as

is called the doubly exponential mean. As $p$ tends to 1, $m_{p,q}$ tends to $m_q$. The other generalization is as follows:

Here $q$ can be negative. When $q = 1$, it corresponds to $m_p$.
