Every data scientist worth her salt will immediately notice that the biggest Earthquakes (magnitude above 9) took place in the last 60 years or so.

Northridge Earthquake

Many journalists, and even some scientists, will claim that giant Earthquakes are indeed getting more severe and more frequent. Even less compelling stories, based on weaker or more questionable numbers, have been published, including in scientific journals, in particular regarding global warming.

But is it really the case that Earthquakes are getting worse? Or is it just a coincidence? Or could a giant Earthquake trigger a few giant Earthquakes 10,000 miles away and 15 years later? Or do giant Earthquakes appear in clusters? It feels like they do if you look at the table below, and if so, we might be entering a quiet period that could last 30 years, after the numerous big ones of the last decade. In short, could the time distribution of Earthquakes be explained?

That's why looking at correlations or time series parameters is not good enough. Potential causes (even if the true cause is yet unknown) should be established before stating that Earthquakes are getting worse.

It turns out that if you look at data since 1900, there is a 17.5% chance that, purely by luck, all the giant Earthquakes would fall in the recent period (none of them occurring before 1950). If you look at the entire data set, the probability drops to 8.5%, but the older data is less reliable, and maybe fewer people were living close to the worst fault lines (Alaska) back then to report these events.

Could the data be faulty?

The first things that should come to any data scientist's mind are that the data collection process is possibly faulty, that the magnitude scale changed over time (so comparing two time periods is comparing apples and oranges), and that we are better equipped today to measure the magnitude of giant Earthquakes (maybe in the past, local sensors would have been destroyed by the Earthquakes).

But even if the data is correct, at least after 1900 (prior to that, most of the data is missing), observing no giant Earthquakes in the first 50 years is an event with a 17.5% probability of occurring by pure chance. So it could really be a coincidence. The use of additional good quality data (volcanoes, small Earthquakes and other related events) could help answer this question. Our data source is Wikipedia. Below is the list of big Earthquakes since 1900.

Assessing whether more severe Earthquakes are a coincidence or not

You can use Monte Carlo simulations to compute the probabilities discussed above. But in this case, a simple computation will do.
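For readers who prefer the simulation route, here is a minimal Monte Carlo sketch for the post-1900 case. It assumes each big Earthquake is independently "giant" with probability 5/17 and that 5 big Earthquakes occurred before 1950 (the counts used in this section). The function name, the number of trials and the seed are illustrative choices, not part of the original analysis.

```python
import random

def simulate_no_giant(p_giant, n_early, trials=200_000, seed=0):
    """Estimate P(no giant quake among n_early big quakes) by simulation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Each early big earthquake is independently giant with probability p_giant;
        # count the runs in which none of them is giant.
        if all(rng.random() >= p_giant for _ in range(n_early)):
            hits += 1
    return hits / trials

# Post-1900 counts from the text: 5 giant out of 17 big, 5 big ones before 1950.
estimate = simulate_no_giant(p_giant=5 / 17, n_early=5)
print(f"Simulated P(no giant Earthquake before 1950): {estimate:.3f}")
```

The estimate should land close to the 17.5% figure, up to simulation noise.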

  • Using data after 1900: There are m=5 out of n=17 Earthquakes with magnitude 9 or above in the time period in question. So the probability for a big Earthquake (above 8.5) to be giant (above 9) is p = m/n = 5/17. The probability of observing zero giant Earthquakes before 1950 is thus (1-p)^k, where k is the number of big Earthquakes during that time period (k=5). The probability turns out to be (12/17)^5 = 17.5%.
  • Using the entire data set: m=6, n=40, and k=28, with p = m/n as before. Only one giant Earthquake is observed before 1950. The probability of such a high concentration of giant Earthquakes during the most recent decades, that is, of observing at most one giant Earthquake among the k events before 1950, can be derived using the binomial model: [k! / (1! (k-1)!)] * p^1 * (1-p)^(k-1) + [k! / (0! k!)] * p^0 * (1-p)^k = 8.5%.
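The two computations above can be sketched in a few lines of Python. This is an illustration under stated assumptions: giant Earthquakes are treated as independent draws with probability p = m/n, and the number of binomial trials is taken to be k, the count of big Earthquakes before 1950. The helper names are hypothetical. Under that reading the first figure reproduces 17.5%, while the second comes out near 6%, somewhat below the 8.5% quoted, so treat the exact value as sensitive to how the events are counted.

```python
from math import comb

def binom_pmf(j, trials, p):
    """P(exactly j successes in `trials` independent draws with success prob p)."""
    return comb(trials, j) * p**j * (1 - p) ** (trials - j)

def prob_at_most(max_giant, trials, p):
    """P(at most max_giant giant quakes among `trials` big quakes)."""
    return sum(binom_pmf(j, trials, p) for j in range(max_giant + 1))

# Data after 1900: m=5 giant out of n=17 big, k=5 big quakes before 1950.
p_recent = 5 / 17
prob_recent = prob_at_most(0, 5, p_recent)   # same as (1 - p)^k

# Entire data set: m=6, n=40, k=28, at most one giant quake before 1950.
p_all = 6 / 40
prob_all = prob_at_most(1, 28, p_all)

print(f"After 1900:      {prob_recent:.1%}")
print(f"Entire data set: {prob_all:.1%}")
```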

If older Earthquakes have their magnitude overestimated, then these two probabilities will be even lower.

For insurance companies, what is the takeaway?

The increase in giant Earthquakes is likely a coincidence, given the not-so-low probabilities of random occurrence and the apparent absence of a rational explanation to justify a worsening. On a different note, are these Earthquakes causing more damage? Maybe not, with new Earthquake-resistant constructions and better early-warning systems (gas pipelines and other utilities being shut down 30 seconds before the quake, and alerts sent to the population via text messages). Note that a 9.5 Earthquake is 10 times larger than an 8.5 in measured amplitude (and roughly 32 times larger in energy released), though I don't know if it translates into 10 times the cost.
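To make the magnitude comparison concrete, here is a small sketch of the usual logarithmic scaling. It assumes the standard rule of thumb that one magnitude unit corresponds to a factor of 10 in measured wave amplitude and a factor of about 10^1.5 in released energy. The function names are illustrative, and neither ratio says anything about costs.

```python
def amplitude_ratio(mag_a, mag_b):
    """Ratio of measured wave amplitudes between two magnitudes (log10 scale)."""
    return 10 ** (mag_a - mag_b)

def energy_ratio(mag_a, mag_b):
    """Approximate ratio of released energy, assuming energy ~ 10^(1.5 * magnitude)."""
    return 10 ** (1.5 * (mag_a - mag_b))

print(f"Amplitude, 9.5 vs 8.5: {amplitude_ratio(9.5, 8.5):.0f}x")
print(f"Energy,    9.5 vs 8.5: {energy_ratio(9.5, 8.5):.1f}x")
```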

So far, insurance companies do not offer affordable Earthquake coverage.

Maybe a journalist reading this post will write an article entitled 'The next giant Earthquake will be a 10'. Unfortunately, this is the kind of baseless news that sells. We've shown here that the increase in giant Earthquakes might just be a coincidence, just as the natural clustering of points in a random distribution is very much expected.


Comment by Sione Palu on December 29, 2014 at 3:01am

Vincent, this topic is well covered in Physics Complex System Theory. There are tons of papers that have been published on the topic of earthquakes.

One of the earliest papers was from the developer of SOC (self-organized criticality) theory, Per Bak et al.

"Earthquakes as a Self-Organized Critical Phenomenon"


Here's a commentary from another pioneer in SOC modelling of earthquakes, Didier Sornette:

"Predicting Earthquakes"


There are abundant papers on earthquakes and the application of complex system theory to them, which can be found via a Google Scholar search.

My data science team has been studying and reading about SOC for a while now because we're interested in applying it to some of our data modelling (we think it is applicable). We haven't dug deep into it yet, but we have collected lots of papers on the topic.
