ARIMA Model (Time Series Forecasting) in a Nutshell

Introduction 

Does your business struggle to make sense of its data or to predict future trends? You're not alone; many businesses stumble here. ARIMA can help you understand the patterns in past data and forecast from them using time series analysis. One reason the ARIMA model is so popular is that its lagged moving-average terms smooth the time series data.

This method is most often seen in technical analysis, where it is used to forecast future security prices. To get a better idea of how it works, you need to understand several core topics:

Time Series Forecasting

Time series forecasting is a trend analysis technique that examines past data and its associated patterns, including cyclical fluctuations and seasonality, to predict the future trend. Success is not guaranteed with this method, but it does give a hint about where the series is heading.

One widely used approach is the Box-Jenkins model, which combines three components to predict future data: autoregression, differencing and moving averages (their orders are denoted p, d and q, respectively).

The ARIMA Model

The Box-Jenkins model is a technique for forecasting from the values of a specified time series. It is also known as the autoregressive integrated moving average method, ARIMA(p,d,q). Using the ARIMA model, you can forecast a time series from its own past values.

Typical uses of ARIMA models include forecasting stock prices and earnings growth.

Nomenclature in ARIMA Model 

A nonseasonal ARIMA model is written ARIMA(p,d,q), where:

  • p is the number of autoregressive terms,
  • d is the number of nonseasonal differences needed for stationarity,
  • q is the number of lagged forecast errors in the prediction equation.

In terms of y, the general forecasting equation is:

ŷ_t  =  μ + ϕ_1 y_{t-1} + … + ϕ_p y_{t-p} - θ_1 e_{t-1} - … - θ_q e_{t-q}

Here e denotes the forecast errors and y denotes the dth difference of Y, which means:

If d=0:  y_t  =  Y_t

If d=1:  y_t  =  Y_t - Y_{t-1}

If d=2:  y_t  =  (Y_t - Y_{t-1}) - (Y_{t-1} - Y_{t-2})  =  Y_t - 2Y_{t-1} + Y_{t-2}
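The three cases above can be checked with a few lines of plain Python. This is only a sketch: the helper name `difference` and the sample values in `Y` are made up for illustration.

```python
def difference(series, d=1):
    """Return the d-th difference of a sequence of numbers."""
    values = list(series)
    for _ in range(d):
        # Each pass replaces the series with y_t = Y_t - Y_{t-1}.
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values


Y = [3, 5, 9, 15, 23, 33]  # made-up data for illustration

first = difference(Y, d=1)   # [2, 4, 6, 8, 10]
second = difference(Y, d=2)  # [2, 2, 2, 2]

# The d=2 case matches the expanded form Y_t - 2*Y_{t-1} + Y_{t-2}.
expanded = [Y[t] - 2 * Y[t - 1] + Y[t - 2] for t in range(2, len(Y))]
print(first, second, expanded == second)
```

Note that each round of differencing shortens the series by one observation.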

ARIMA (1,0,0): 

This is the first-order autoregressive model: if the series is stationary and autocorrelated, it can be predicted as a multiple of its own previous value, plus a constant. The forecasting equation becomes:

Ŷ_t  =  μ  +  ϕ_1 Y_{t-1}

In other words, Y is regressed on itself lagged by one period. If the mean of Y is zero, the constant term is not included.

If the slope coefficient ϕ_1 is positive and less than 1 in magnitude, the model shows mean-reverting behavior in which the next period's value is predicted to be ϕ_1 times as far from the mean as this period's value. If ϕ_1 is negative, the model shows mean-reverting behavior with alternation of signs: Y is predicted to be below the mean next period if it is above the mean this period.
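To make the mean-reverting behavior concrete, here is a small sketch that iterates the first-order autoregressive equation for a positive and a negative slope coefficient. The function name `ar1_forecasts` and the values of `mu`, `phi1` and `last_value` are assumptions chosen for illustration, not estimates from real data.

```python
def ar1_forecasts(last_value, mu, phi1, steps):
    """Iterate Y_hat_t = mu + phi1 * Y_{t-1}, feeding each forecast back in."""
    forecasts = []
    value = last_value
    for _ in range(steps):
        value = mu + phi1 * value
        forecasts.append(value)
    return forecasts


mu = 2.0           # hypothetical constant term
last_value = 12.0  # hypothetical last observation

for phi1 in (0.5, -0.5):
    mean = mu / (1 - phi1)  # long-run mean implied by the equation
    path = ar1_forecasts(last_value, mu, phi1, steps=4)
    deviations = [round(value - mean, 4) for value in path]
    print(phi1, deviations)
```

With the positive coefficient, each deviation from the mean is half the previous one. With the negative coefficient, the deviations shrink at the same rate but flip sign every period.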

ARIMA (0,1,1) with constant: 

Implementing the simple exponential smoothing (SES) model as an ARIMA(0,1,1) model adds flexibility. First, the estimated MA(1) coefficient is allowed to be negative; this corresponds to a smoothing factor larger than 1, which the SES model-fitting procedure usually does not allow. Second, you can add a constant term to the ARIMA model to estimate an average non-zero trend. The forecasting equation is:

Ŷ_t  =  μ  +  Y_{t-1}  -  θ_1 e_{t-1}
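The link to SES can be verified numerically. The sketch below assumes no constant (μ = 0), a smoothing factor of 1 - θ1, and a first forecast equal to the first observation; the function names and the data are hypothetical.

```python
def arima011_one_step(series, theta1, mu=0.0):
    """One-step forecasts from Y_hat_t = mu + Y_{t-1} - theta1 * e_{t-1}."""
    forecasts = [series[0]]  # assumption: the first forecast error is zero
    for t in range(1, len(series) + 1):
        error = series[t - 1] - forecasts[t - 1]
        forecasts.append(mu + series[t - 1] - theta1 * error)
    return forecasts  # forecasts[t] is the prediction for period t


def ses_one_step(series, alpha):
    """Simple exponential smoothing with smoothing factor alpha."""
    forecasts = [series[0]]  # same starting assumption as above
    for t in range(1, len(series) + 1):
        forecasts.append(alpha * series[t - 1] + (1 - alpha) * forecasts[t - 1])
    return forecasts


data = [10.0, 12.0, 11.0, 13.0, 12.5]  # made-up observations
theta1 = 0.3

arima_path = arima011_one_step(data, theta1)
ses_path = ses_one_step(data, alpha=1 - theta1)

# Without a constant, the two sets of forecasts agree up to rounding error.
print(all(abs(a - s) < 1e-9 for a, s in zip(arima_path, ses_path)))
```

Passing a non-zero `mu` to `arima011_one_step` adds the constant trend term that plain SES lacks.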

How to Make a Series Stationary in Time Series Forecasting? 

The simplest way to make a series stationary is to difference it, that is, to subtract the previous value from the current value. Depending on the complexity of the series, more than one round of differencing may be needed.

The value of d should be the minimum number of differences needed to make the series stationary. If the series is already stationary, then d = 0.
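One rough way to compare candidate values of d is to look at the spread of the series after each round of differencing. The sketch below applies a common rule of thumb (prefer the order with the lowest standard deviation); it is a heuristic rather than a formal stationarity test, and in practice a unit-root test such as the augmented Dickey-Fuller test is often used instead. The function names and the simulated random walk are assumptions for illustration.

```python
import random
import statistics


def difference(series, d=1):
    """Return the d-th difference of a sequence of numbers."""
    values = list(series)
    for _ in range(d):
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values


def pick_d_by_lowest_std(series, max_d=2):
    """Heuristic only: pick the d in 0..max_d with the smallest spread."""
    spreads = {d: statistics.pstdev(difference(series, d))
               for d in range(max_d + 1)}
    return min(spreads, key=spreads.get), spreads


# Simulated random walk with drift, which needs one difference.
rng = random.Random(42)
level = 0.0
walk = []
for _ in range(200):
    level += rng.gauss(0.5, 1.0)
    walk.append(level)

best_d, spreads = pick_d_by_lowest_std(walk)
print(best_d, spreads)
```

For this simulated series the spread drops sharply after one difference and rises again after a second, which points to d = 1.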

AR and MA Models in terms of (p), (q), and (d):

AR (p): Autoregression: a model that uses the dependent relationship between the current observation and previous observations. It uses past values of the series as predictors in the regression equation, and the order p is the number of lagged values included.

I (d): Integration: makes the series stationary by differencing, that is, subtracting the previous value from the current value. This is repeated d times, until the series becomes stationary.

MA (q): Moving Average: uses the dependency between an observation and the residual errors from a moving average model applied to lagged observations. The moving average component models the current error as a linear combination of previous errors, and the order q is the number of such terms in the model.
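To see how the three components fit together, here is a minimal sketch of a one-step-ahead ARIMA(p,d,q) forecast with coefficients that are already known. It does not estimate anything: `arima_one_step`, the coefficient values and the data are hypothetical, and values and errors from before the start of the sample are treated as zero, which is a simplifying assumption.

```python
def arima_one_step(series, d, mu, phi, theta):
    """One-step-ahead ARIMA(p,d,q) forecast with known coefficients."""
    # I(d): difference d times, keeping every intermediate series.
    levels = [list(series)]
    for _ in range(d):
        prev = levels[-1]
        levels.append([b - a for a, b in zip(prev, prev[1:])])
    y = levels[-1]

    p, q = len(phi), len(theta)
    errors = []
    for t in range(len(y) + 1):
        # AR(p): weighted sum of the last p values of the differenced series.
        ar_part = sum(phi[i] * y[t - 1 - i]
                      for i in range(p) if t - 1 - i >= 0)
        # MA(q): weighted sum of the last q forecast errors.
        ma_part = sum(theta[j] * errors[t - 1 - j]
                      for j in range(q) if t - 1 - j >= 0)
        y_hat = mu + ar_part - ma_part
        if t == len(y):
            break  # y_hat is now the forecast for the next, unseen period
        errors.append(y[t] - y_hat)

    # Undo the differencing by adding back the last value at each level.
    forecast = y_hat
    for level in reversed(levels[:-1]):
        forecast += level[-1]
    return forecast


data = [10.0, 12.0, 11.0]  # made-up observations
print(arima_one_step(data, d=0, mu=2.0, phi=[0.5], theta=[]))  # ARIMA(1,0,0)
print(arima_one_step(data, d=1, mu=0.0, phi=[], theta=[0.5]))  # ARIMA(0,1,1)
```

Setting `phi` and `theta` to empty lists with d = 1 and μ = 0 reduces the model to a random walk, whose forecast is simply the last observed value.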

How to Handle a Time Series That Is Slightly Under- or Over-Differenced:

At a given order of differencing, the series may still be slightly under-differenced, yet differencing it one more time may leave it slightly over-differenced. When the series is slightly under-differenced, adding one or more AR terms usually compensates. When it is slightly over-differenced, try adding MA terms instead.
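A quick diagnostic for over-differencing is the lag-1 autocorrelation of the differenced series: as a rule of thumb, a value near -0.5 or below suggests the series has been differenced once too often. The sketch below illustrates this on simulated white noise, which needs no differencing at all. The function name and the simulated data are assumptions for illustration.

```python
import random


def lag1_autocorrelation(values):
    """Sample autocorrelation at lag 1."""
    n = len(values)
    mean = sum(values) / n
    numerator = sum((values[t] - mean) * (values[t - 1] - mean)
                    for t in range(1, n))
    denominator = sum((v - mean) ** 2 for v in values)
    return numerator / denominator


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # already stationary
over_differenced = [b - a for a, b in zip(noise, noise[1:])]

acf_noise = lag1_autocorrelation(noise)
acf_diff = lag1_autocorrelation(over_differenced)
print(round(acf_noise, 2), round(acf_diff, 2))
```

The original noise has a lag-1 autocorrelation close to zero, while the needlessly differenced version sits close to -0.5, the pattern that an extra MA term is meant to absorb.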

Accuracy Metrics in Time Series Analysis 

Commonly used metrics for evaluating forecast accuracy are:

  • Mean Absolute Percentage Error (MAPE)
  • Mean Error (ME)
  • Mean Absolute Error (MAE)
  • Mean Percentage Error (MPE)
  • Root Mean Square Error (RMSE)
  • Lag 1 Autocorrelation of Error (ACF1)
  • Correlation between the Actual and the Forecast (Corr)
  • Min-Max Error (MinMax)
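The metrics listed above can be computed in a few lines. The sketch below makes several assumptions that are not stated in the list: errors are defined as forecast minus actual, MAPE and MPE are returned as fractions rather than percentages, Min-Max Error is taken as one minus the mean ratio of the smaller to the larger of each actual/forecast pair (so it assumes positive values), and all actual values are non-zero. The function names and sample numbers are hypothetical.

```python
import math


def lag1_autocorrelation(values):
    """Sample autocorrelation at lag 1."""
    n = len(values)
    mean = sum(values) / n
    numerator = sum((values[t] - mean) * (values[t - 1] - mean)
                    for t in range(1, n))
    return numerator / sum((v - mean) ** 2 for v in values)


def pearson(x, y):
    """Pearson correlation coefficient between two sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def forecast_accuracy(actual, forecast):
    """Return the listed metrics; error = forecast - actual (assumption)."""
    n = len(actual)
    errors = [f - a for a, f in zip(actual, forecast)]
    return {
        "MAPE": sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n,
        "ME": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "MPE": sum(e / a for e, a in zip(errors, actual)) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "ACF1": lag1_autocorrelation(errors),
        "Corr": pearson(actual, forecast),
        "MinMax": 1 - sum(min(a, f) / max(a, f)
                          for a, f in zip(actual, forecast)) / n,
    }


actual = [100.0, 110.0, 120.0, 130.0]    # made-up held-out values
forecast = [102.0, 108.0, 123.0, 129.0]  # made-up forecasts

metrics = forecast_accuracy(actual, forecast)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MAPE, Corr and MinMax are scale-free, which makes them the easier ones to compare across series measured in different units.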

Final Words

Time series forecasting is a classic way to understand future trends and patterns, although success is not guaranteed. To stay ahead, businesses need to analyze past and ongoing trends regularly in order to anticipate future ones, and that's where time series forecasting comes in.

The ARIMA model uses autoregression, differencing and moving averages to produce forecasts, which are then evaluated with accuracy metrics. In a nutshell, you learned about the ARIMA model, its terminology, making a series stationary, handling under- and over-differenced series, and the accuracy metrics.
