Checking the stationarity of time series – Applying Machine Learning Algorithms – MLS-C01 Study Guide

Checking the stationarity of time series

Decomposing time series and understanding how their components interact with additive and multiplicative models is a great achievement! However, the more you learn, the more you want to go deeper into the problem. Maybe you have realized that time series without trend and seasonality are easier to predict than the ones with all those components!

That intuition is right. If there is no trend or seasonality to account for, and since the noise is beyond your control, all you have to do is explore the observed values and model how each value relates to the ones that came before it.

A time series whose mean and variance stay constant over time is known as stationary. In general, time series with trend and seasonality are not stationary. It is possible to apply data transformations to turn such a series into a stationary one, which tends to make the modeling task easier. The most common transformation of this type is known as differencing: each observation is replaced by its difference from a previous observation.
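As a minimal sketch of that transformation, the snippet below applies it to a made-up series with a linear trend and a repeating 4-step pattern. The `difference` helper, the `seasonal_pattern` values, and the series itself are assumptions chosen for illustration, not something taken from the book.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical series: upward trend (3 per step) plus a repeating 4-step pattern.
seasonal_pattern = [0, 2, -1, 1]
series = [3 * t + seasonal_pattern[t % 4] for t in range(16)]

# Lag-1 difference: the trend is gone, but the 4-step pattern still shows.
first_diff = difference(series, lag=1)

# Lag-4 (seasonal) difference: the pattern cancels and the trend becomes a constant.
seasonal_diff = difference(series, lag=4)

print(first_diff[:8])     # [5, 0, 5, 2, 5, 0, 5, 2]
print(seasonal_diff[:4])  # [12, 12, 12, 12]
```

In this toy case, the lag-1 result no longer drifts upward, and the lag-4 result has a constant mean and no seasonal swing at all. Real data is noisier, so the transformed series would fluctuate around a constant level rather than sit exactly on it.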

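While you are exploring a time series, you can check stationarity by applying hypothesis tests, such as Augmented Dickey-Fuller (ADF), KPSS, and Phillips-Perron, just to mention a few. Keep in mind that these tests do not share the same null hypothesis: in ADF and Phillips-Perron, the null hypothesis is that the series has a unit root (it is non-stationary), whereas in KPSS, the null hypothesis is that the series is stationary. If you find the series non-stationary, then you can apply the differencing transformation described above to make it stationary. Some algorithms already have that capability embedded.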
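To make the idea behind this family of tests more concrete, here is a simplified, non-augmented Dickey-Fuller sketch written with the standard library only. It regresses the change in the series on the previous value and compares the resulting t-statistic with an approximate 5% critical value of -2.86 (constant-only case, large samples). The function name, the simulated data, and the seed are assumptions for illustration. In practice, you would more likely rely on a library routine such as `adfuller` from statsmodels, which also adds lagged difference terms and reports a p-value.

```python
import math
import random

def dickey_fuller_t(series):
    """t-statistic of gamma in the regression dy[t] = alpha + gamma * y[t-1] + e[t]."""
    x = series[:-1]                                   # y[t-1]
    z = [b - a for a, b in zip(series, series[1:])]   # dy[t] = y[t] - y[t-1]
    n = len(x)
    x_mean = sum(x) / n
    z_mean = sum(z) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxz = sum((xi - x_mean) * (zi - z_mean) for xi, zi in zip(x, z))
    gamma = sxz / sxx                                 # OLS slope
    alpha = z_mean - gamma * x_mean                   # OLS intercept
    ssr = sum((zi - alpha - gamma * xi) ** 2 for xi, zi in zip(x, z))
    std_err = math.sqrt(ssr / (n - 2) / sxx)          # standard error of gamma
    return gamma / std_err

# Approximate large-sample 5% critical value (constant-only case).
# Dickey-Fuller statistics do not follow the usual Student's t table.
CRITICAL_5_PCT = -2.86

# Simulated data (an assumption for this sketch): a random walk and its first difference.
random.seed(0)
walk = [0.0]
for _ in range(499):
    walk.append(walk[-1] + random.gauss(0, 1))
diffed = [b - a for a, b in zip(walk, walk[1:])]

t_walk = dickey_fuller_t(walk)
t_diffed = dickey_fuller_t(diffed)

for name, t_stat in [("random walk", t_walk), ("differenced walk", t_diffed)]:
    verdict = "unit root rejected" if t_stat < CRITICAL_5_PCT else "unit root not rejected"
    print(f"{name}: t = {t_stat:.2f} -> {verdict}")
```

A strongly negative statistic is evidence against the unit root, which is what you would expect to see for the differenced series.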

Exploring, exploring, and exploring

At this point, it is important to remember that exploration tasks happen all the time in data science. Nothing is different here. While you are building time series models, you might want to take a look at the data and check whether it is suitable for this type of modeling.

Autocorrelation plots are one of the tools that you can use for time series analysis. They allow you to check how strongly the series correlates with lagged copies of itself, one lag at a time. Figure 6.11 shows an example of this type of visualization.

Figure 6.11 – Autocorrelation plot

Remember, if you are working with a univariate time series, your time series contains just one variable. Therefore, finding autocorrelation across the lags of that single variable is crucial to understanding whether you can build a good model or not.
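The values behind an autocorrelation plot can be computed by hand. The following sketch, which assumes a made-up series with a pure 12-step cycle and a hypothetical `autocorrelation` helper, computes the sample autocorrelation for each lag using only the standard library.

```python
import math

def autocorrelation(series, max_lag):
    """Sample autocorrelation of the series for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]

# Hypothetical series with a pure 12-step seasonal cycle (10 full cycles).
series = [math.sin(2 * math.pi * t / 12) for t in range(120)]
acf = autocorrelation(series, max_lag=12)

for lag in (0, 6, 12):
    print(f"lag {lag:2d}: {acf[lag]:+.2f}")
# lag  0: +1.00
# lag  6: -0.95
# lag 12: +0.90
```

The strong positive value at lag 12 and the strong negative value at lag 6 reflect the 12-step cycle: values one full cycle apart move together, while values half a cycle apart move in opposite directions.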

And yes, it turns out that, sometimes, the data in front of you has no time structure to exploit: its values have a constant mean and variance and are uncorrelated with one another at every lag. No matter your efforts, you will not be able to model this data as a time series. This type of data is often known as white noise.

Another type of series that you cannot predict is known as a random walk. Random walks are random by nature, but each value depends on the previous time step. For example, the next point of a random walk could be the last point of the series plus a random number between 0 and 1.
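The difference between these two kinds of series can be seen in a small simulation. The sketch below is only illustrative: the Gaussian steps, the seed, and the `lag1_autocorrelation` helper are assumptions. It builds a random walk by accumulating white noise, then compares the lag-1 autocorrelation of both series.

```python
import random

def lag1_autocorrelation(series):
    """Sample autocorrelation of the series at lag 1."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    num = sum(centered[t] * centered[t - 1] for t in range(1, n))
    return num / sum(c * c for c in centered)

random.seed(42)

# White noise: independent draws with constant mean and variance.
noise = [random.gauss(0, 1) for _ in range(1000)]

# Random walk: each point is the previous point plus a random step.
walk = []
level = 0.0
for step in noise:
    level += step
    walk.append(level)

# Differencing the walk gives back the random steps it was built from.
recovered = [walk[0]] + [b - a for a, b in zip(walk, walk[1:])]

noise_r1 = lag1_autocorrelation(noise)
walk_r1 = lag1_autocorrelation(walk)
print(f"white noise lag-1 autocorrelation: {noise_r1:+.3f}")
print(f"random walk lag-1 autocorrelation: {walk_r1:+.3f}")
```

You should see a value close to 0 for the white noise and a value close to 1 for the random walk. The high autocorrelation of the walk can be misleading: it comes from accumulating past steps, and the steps themselves remain unpredictable.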

Important note

Be careful if you come across those terms in the exam and remember to relate them to randomness in time series.

With that, you have covered the main theory behind time series modeling. You should also be aware that two of the most popular algorithms out there for working with time series are Auto-Regressive Integrated Moving Average (ARIMA) and Exponential Smoothing (ETS, short for Error, Trend, Seasonality). This book will not go into the details of these two models. Instead, you will see what AWS can offer in terms of time series modeling.