This article is a chunk from one of my blog posts on ARIMA time series forecasting with Python. It is a pretty extensive tutorial, so unless you are really interested in learning the ins and outs of ARIMA time series forecasting, don't bother to click.
But I did want to share this list of 5 very useful metrics as a quick read on how to evaluate forecasting errors when working with time series data. Here we also learn the situations in which one measure fails and another succeeds. I hope you like this chunk. I am being lazy and just copy-pasting some of the interesting points from the original article, in the hope of reaching the readers of Data Science Central with new and refreshed information.
An error is the difference between the actual value and its forecast. Residuals differ from forecast errors in two ways. First, residuals are calculated on the training dataset, whereas forecast errors are calculated on the test or validation dataset. Second, forecasts can involve multiple steps, whereas residuals are based on single-step forecasts. Some of the metrics we can use to summarise forecasting errors are given below. But before that, let us look at the formula for calculating the error: error = Y - P, where Y represents the actual values and P represents the predicted/forecasted values.
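To make the distinction between residuals and forecast errors concrete, here is a minimal sketch. It uses a naive "repeat the last observed value" forecast instead of ARIMA purely to keep the example short, and the `train` and `test` numbers are made up for illustration.

```python
import numpy as np

# Hypothetical data: the first four points are used for fitting,
# the last three are held out for evaluation.
train = np.array([10.0, 12.0, 13.0, 15.0])
test = np.array([16.0, 18.0, 17.0])

# Residuals: one-step-ahead errors on the training data.
# The naive model predicts each point with the previous observation.
residuals = train[1:] - train[:-1]

# Forecast errors: multi-step errors on the held-out data.
# The naive model repeats the last training value for every future step.
forecast = np.full(len(test), train[-1])
forecast_errors = test - forecast  # actual minus forecast

print("residuals:      ", residuals)
print("forecast errors:", forecast_errors)
```

The residuals only ever look one step ahead inside the training data, while the forecast errors are measured on data the model never saw, one, two and three steps ahead.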
Both MAE (mean absolute error) and RMSE (root mean squared error) are scale-dependent errors. This means that the errors are on the same scale as the data. What does this mean for us? It means we cannot use these measures to compare the results of two time series forecasts that are in different units.
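A quick way to see the scale dependence is to score the same forecast twice, once in metres and once in kilometres. The sketch below assumes the usual definitions of MAE and RMSE; the helper names `mae` and `rmse` and the numbers are my own, chosen only for illustration.

```python
import numpy as np

def mae(actual, forecast):
    """Mean absolute error."""
    return np.mean(np.abs(actual - forecast))

def rmse(actual, forecast):
    """Root mean squared error."""
    return np.sqrt(np.mean((actual - forecast) ** 2))

# Hypothetical distances in metres.
actual_m = np.array([1000.0, 1500.0, 2000.0])
forecast_m = np.array([1100.0, 1400.0, 2300.0])

# Exactly the same series, expressed in kilometres.
actual_km = actual_m / 1000.0
forecast_km = forecast_m / 1000.0

mae_m, mae_km = mae(actual_m, forecast_m), mae(actual_km, forecast_km)
rmse_m, rmse_km = rmse(actual_m, forecast_m), rmse(actual_km, forecast_km)

print("MAE  (m, km):", mae_m, mae_km)
print("RMSE (m, km):", rmse_m, rmse_km)
```

The forecast is equally good in both cases, yet the metre version reports errors 1000 times larger, which is why these numbers should not be compared across series measured in different units.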
If you look closely at the formula for MAPE (mean absolute percentage error), which divides each absolute error by the actual value Y, you will realize that if Y is zero, the MAPE becomes infinite or undefined (a typical divide-by-zero problem). What does this mean? It means that we should not use MAPE if our time series contains zero values. Another disadvantage of MAPE is that it puts heavier penalties on negative errors than on positive errors.
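Both weaknesses are easy to reproduce. The sketch below assumes MAPE is computed as the mean of |Y - P| / |Y|; the helper name `mape` and the numbers are hypothetical. The second part shows one common way to illustrate the asymmetry: the same absolute miss of 50 units is penalised more when the forecast overshoots (negative error) than when it undershoots (positive error).

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction (0.1 means 10%)."""
    return np.mean(np.abs(actual - forecast) / np.abs(actual))

# 1) A single zero in the actuals makes MAPE blow up.
actual = np.array([0.0, 10.0, 20.0])
forecast = np.array([1.0, 12.0, 18.0])
with np.errstate(divide="ignore", invalid="ignore"):  # silence the divide-by-zero warning
    mape_with_zero = mape(actual, forecast)
print("MAPE with a zero actual:", mape_with_zero)

# 2) Same absolute miss of 50 units, different penalty.
# Forecast too high: actual 100, forecast 150 (error = -50).
mape_over = mape(np.array([100.0]), np.array([150.0]))
# Forecast too low: actual 150, forecast 100 (error = +50).
mape_under = mape(np.array([150.0]), np.array([100.0]))
print("overshoot:", mape_over, "undershoot:", mape_under)
```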
Now that we have briefly touched upon some of the most popular methods of calculating forecasting errors, let's look at what packages and functions can be used in Python to generate these statistics.
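import numpy as np
# actual and forecast are NumPy arrays of the same length
# Calculating MAE
mae = np.mean(np.abs(actual - forecast))
# Calculating MAPE
mape = np.mean(np.abs(actual - forecast) / np.abs(actual))
# Calculating WMAPE
wmape = np.sum(np.abs(actual - forecast)) / np.sum(np.abs(actual))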
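As a usage sketch, the calculations above can be wrapped in one small helper and run on a series that contains a zero. The function name `forecast_metrics` and the sample numbers are my own, and the RMSE line assumes the standard root-mean-squared-error definition. MAPE is left out on purpose because it is not defined for this data, whereas WMAPE stays finite as long as the actuals do not sum to zero.

```python
import numpy as np

def forecast_metrics(actual, forecast):
    """Return MAE, RMSE and WMAPE for two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    abs_err = np.abs(actual - forecast)
    return {
        "mae": float(np.mean(abs_err)),
        "rmse": float(np.sqrt(np.mean((actual - forecast) ** 2))),
        # Total absolute error divided by total absolute actuals.
        "wmape": float(np.sum(abs_err) / np.sum(np.abs(actual))),
    }

# Hypothetical series with a zero actual value, where plain MAPE would fail.
metrics = forecast_metrics([0, 10, 20, 30], [2, 8, 23, 27])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```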
Thank you for reading. Happy learning!