Every data scientist worth her salt will immediately notice that the biggest earthquakes (magnitude above 9) took place in the last 60 years or so.
Most journalists, and even some scientists, will claim that giant earthquakes are indeed getting more severe and more frequent. Even less compelling stories, in particular about global warming and based on even weaker or more questionable numbers, have been published, including in scientific journals.
But is it really the case that earthquakes are getting worse? Or is it just a coincidence? Could a giant earthquake trigger a few other giant earthquakes 10,000 miles away and 15 years later? Do giant earthquakes appear in clusters? It looks that way if you look at the table below, and if so, we might be entering a quiet period that could last 30 years, after the numerous big ones of the last decade. In short, can the time distribution of earthquakes be explained?
That's why looking at correlations or time series parameters is not good enough. Potential causes (even if the true cause is still unknown) should be established before stating that earthquakes are getting worse.
It turns out that if you look at data since 1900, there is only a 17.5% chance of having been hit by so many giant earthquakes recently, with none of them occurring before 1950, if their timing were purely random. If you look at the entire data set, the probability drops to 8.5%, but the older data is less reliable, and perhaps fewer people were living close to the worst fault lines (Alaska) back then to report these events.
Could the data be faulty?
The first things that should come to any data scientist's mind are that the data collection process is possibly faulty, that the magnitude scale changed over time (so comparing two time periods is comparing apples and oranges), and that we are better equipped today to measure the magnitude of giant earthquakes (in the past, local sensors may have been destroyed by the earthquakes themselves).
But even if the data is correct, at least after 1900 (prior to that, most of the data is missing), observing no giant earthquakes in the first 50 years is an event with a 17.5% probability of occurring by pure chance. So, it could really be a coincidence. Additional good-quality data (volcanoes, small earthquakes and other related events) could help answer this question. Our data source is Wikipedia. Below is the list of big earthquakes since 1900.
Assessing whether more severe earthquakes are a coincidence or not
You can use Monte Carlo simulations to compute the probabilities discussed above. But in this case, a simple computation will do.
If the magnitudes of older earthquakes were overestimated, then these two probabilities would be even lower.
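As a rough illustration of both approaches, here is a minimal sketch. It assumes that each giant earthquake's date is independent and uniformly distributed over the observation window, which is one possible null model and not necessarily the one behind the figures quoted above. The function names and the inputs (5 events, a 110-year window, 50 quiet years) are hypothetical placeholders, not the data set used in this post, so the output should not be expected to match the 17.5% or 8.5% figures.

```python
import random


def prob_quiet_start(n_events, total_years, quiet_years):
    """Exact probability that all n_events fall after the first quiet_years.

    Assumption: event dates are independent and uniform over total_years.
    """
    return ((total_years - quiet_years) / total_years) ** n_events


def simulate_quiet_start(n_events, total_years, quiet_years,
                         trials=100_000, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one random date per event; count the trial as a hit when
        # no event lands inside the initial quiet period.
        if all(rng.uniform(0, total_years) >= quiet_years
               for _ in range(n_events)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Illustrative inputs only (assumptions, not the post's data).
    n, total, quiet = 5, 110, 50
    print("exact:    ", prob_quiet_start(n, total, quiet))
    print("simulated:", simulate_quiet_start(n, total, quiet))
```

The closed form is just the chance that one event misses the quiet period, raised to the number of events. The simulation should agree with it to within sampling noise, and it becomes the more useful tool once the null model gets more complicated (for instance, if events are allowed to cluster).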
For insurance companies, what is the takeaway?
The increase in giant earthquakes is likely a coincidence, given the not-so-low probabilities of random occurrence and the apparent absence of a rational explanation to justify a worsening. On a different note, are these earthquakes causing more damage? Maybe not, thanks to new earthquake-resistant construction and better early-warning systems (gas pipelines and other utilities being shut down 30 seconds before the quake, and alerts sent to the population via text message). Note that a 9.5 earthquake produces 10 times the ground-motion amplitude of an 8.5 and releases roughly 32 times the energy, though I don't know if that translates into proportionally higher costs.
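Since the magnitude scale is logarithmic, the gap between two magnitudes is easy to misread. The sketch below assumes the standard relations for magnitude scales: a tenfold increase in recorded amplitude per unit of magnitude, and released energy proportional to 10^(1.5 × magnitude). The function names are made up for this illustration.

```python
def amplitude_ratio(m_big, m_small):
    """Ratio of recorded wave amplitudes between two magnitudes."""
    return 10 ** (m_big - m_small)


def energy_ratio(m_big, m_small):
    """Ratio of released energy, assuming energy scales as 10**(1.5 * M)."""
    return 10 ** (1.5 * (m_big - m_small))


if __name__ == "__main__":
    print(amplitude_ratio(9.5, 8.5))  # tenfold amplitude per magnitude unit
    print(energy_ratio(9.5, 8.5))     # about 31.6 times the energy
```

Neither ratio says anything about cost, which also depends on location, depth, building standards and population density.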
So far, insurance companies do not offer affordable earthquake insurance.
Maybe a journalist reading this post will write an article entitled 'The next giant earthquake will be a 10'. Unfortunately, this is the kind of baseless news that sells. We've shown here that the increase in giant earthquakes might just be a coincidence, in the same way that the natural clustering of points in a random distribution is entirely expected.