Dear friends!
In time series analysis, stationarity is a fundamental prerequisite for accurate modeling. Its absence can skew predictions and render models ineffective. In this article, I dissect the importance of stationarity, guiding you through its nuances and the techniques for detecting it. Are you ready? Let’s go! 🚀
Why is stationarity important?
Stationarity means the statistical properties of a process generating a time series remain consistent. This is foundational as many time-series models, particularly those in forecasting, assume the data is stationary. Economic and financial time series, like stock prices or housing market indices, are affected by numerous factors causing changes in their distributions. For instance, over the course of a decade, a stock might undergo significant volatility due to various corporate decisions, market dynamics, and global economic shifts. Using non-stationary data for predictive modeling might lead to unreliable predictions, as the model could overemphasize recent trends without grasping the stock’s broader behavior. With stationary data, the series remains predictable over time, ensuring models perform consistently on both past and future data. It also guarantees that metrics like mean or variance from the data are reliable.
Definitions of stationarity
Stationarity can be broadly split into two types:
Weak stationarity, often referred to as second-order stationarity, focuses on three principal criteria. Firstly, it demands a constant mean throughout the series, indicating that the average value doesn’t drift over time. This is akin to observing that the average daily returns of a stock remain consistent over the years. Secondly, weak stationarity requires a constant variance, which means that the fluctuations or volatility in the data don’t widen or narrow as time progresses. Lastly, the autocovariance, which measures how the data correlates with its past values, should be time-invariant. In financial contexts, this would suggest that the lagged relationships within stock price movements, for instance, remain consistent over time.
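For readers who like formulas, the three conditions of weak stationarity for a process X_t can be summarized as:

```latex
\begin{aligned}
\mathbb{E}[X_t] &= \mu && \text{(constant mean for all } t\text{)} \\
\mathrm{Var}(X_t) &= \sigma^2 < \infty && \text{(constant, finite variance)} \\
\mathrm{Cov}(X_t,\, X_{t+h}) &= \gamma(h) && \text{(depends only on the lag } h\text{, not on } t\text{)}
\end{aligned}
```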
Strong stationarity, a more stringent form, demands that the entire probability distribution of the series remains unchanged even when shifted in time. This means that statistical properties, including higher moments like skewness or kurtosis, of any segment of data, are identical to those of any other segment, irrespective of the time frame. In practical terms, if analyzing property prices, it would mean the distribution of prices in one decade mirrors that of another, maintaining similar patterns and behaviors.
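A quick numeric sanity check makes these definitions concrete: split a series in half and compare each half’s mean and variance. Below is a minimal Python sketch (the simulated series and the 0.01 trend slope are invented for illustration); the stationary series should show similar statistics in both halves, while the trending one should not.

```python
import random
import statistics

def half_split_stats(series):
    """Return (mean, variance) for each half of the series."""
    mid = len(series) // 2
    return [(statistics.mean(half), statistics.variance(half))
            for half in (series[:mid], series[mid:])]

random.seed(42)

# A (weakly) stationary series: i.i.d. noise around a fixed mean of 0.
stationary = [random.gauss(0, 1) for _ in range(2000)]

# A non-stationary series: the same noise plus a steady upward trend.
trending = [x + 0.01 * t for t, x in enumerate(stationary)]

print(half_split_stats(stationary))  # both halves: mean near 0, variance near 1
print(half_split_stats(trending))    # second half has a much higher mean
```

This half-split comparison is exactly the intuition behind the more formal tests discussed later in the article.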
Visual tests for stationarity
Visual inspection can serve as an intuitive first step in determining stationarity. When presented with a graph or a set of graphs, certain patterns emerge that can either suggest or refute the idea of a stationary dataset.
To illustrate, consider four distinct graphs:
- The first graph showcases a stationary time series. Here, we observe a line that neither trends upwards nor downwards and demonstrates stable, consistent fluctuations around a constant mean. This consistency resembles the behavior of certain financial instruments that aren’t influenced by macroeconomic trends and remain stable over time.
- In the second graph, we identify a time-dependent mean. The line fluctuates, but there’s a noticeable upward drift. This could represent a stock price that, despite its short-term fluctuations, has a long-term upward trajectory, possibly due to positive company growth or favorable market conditions.
- The third graph exhibits time-dependent variance. Initially, the fluctuations are relatively mild, but they intensify around the middle and then diminish towards the end. This could symbolize a real estate market witnessing increased volatility during a particular period (e.g., a housing bubble) followed by stabilization.
- Lastly, the graph illustrating time-dependent covariance shows fluctuations that are concentrated or tighter in the middle and more dispersed at the beginning and end. This suggests periods of consistent returns (tighter fluctuations) amidst more volatile times.
Statistical tests for stationarity
Recognizing such patterns visually can set the stage for more rigorous statistical testing. These tests assess far more reliably whether a dataset is stationary, helping to validate or overturn initial visual impressions.
A central concept in stationarity testing is the unit root. Essentially, a unit root denotes a stochastic trend, commonly referred to as a “random walk with drift”. Because each new value builds on the previous one plus a random shock, the level of such a series cannot be predicted far ahead. So, if a unit root is present, the time series is non-stationary. Conversely, if no unit root is present, the series can be treated as stationary (possibly after removing a deterministic trend).
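To make the “random walk with drift” idea tangible, here is a minimal Python sketch (the drift and noise parameters are arbitrary choices for illustration). Each value builds on the previous one, so shocks accumulate and the series drifts away from its starting level rather than reverting to a mean:

```python
import random

def random_walk_with_drift(n, drift=0.1, sigma=1.0, seed=None):
    """Simulate x_t = drift + x_{t-1} + noise_t, a unit-root process:
    every shock is carried forward forever, so the level wanders
    instead of reverting to a fixed mean."""
    rng = random.Random(seed)
    x, path = 0.0, []
    for _ in range(n):
        x += drift + rng.gauss(0, sigma)
        path.append(x)
    return path

walk = random_walk_with_drift(500, seed=1)
print(walk[-1])  # the level has typically drifted far from 0
```

With sigma set to 0 the path reduces to a pure deterministic drift, which is a handy way to see the two components of the process separately.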
The next step involves setting up hypotheses for testing. Each test pits two competing hypotheses against each other: one stating that the series is stationary, the other that it is not. Be careful, though: which of the two plays the role of the null hypothesis (H0) differs between tests, as we will see below.
The decision to reject or fail to reject the null hypothesis is based on either the p-value approach or the critical value approach. For the p-value approach: if the p-value is greater than 0.05, we fail to reject the null hypothesis; if it is less than or equal to 0.05, the null hypothesis is rejected. With the critical value approach, if the test statistic is less extreme than the critical value, we fail to reject the null hypothesis; if it is more extreme, the null hypothesis is rejected. The critical value approach is particularly useful when the p-value is borderline, hovering around 0.05.
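The p-value decision rule is mechanical enough to write down as a tiny helper. The function below is a hypothetical illustration, not part of any statistics library:

```python
def decide(p_value, alpha=0.05):
    """Apply the p-value rule: reject H0 when p-value <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

print(decide(0.01))  # reject H0
print(decide(0.30))  # fail to reject H0
```

What rejecting H0 *means* for stationarity then depends entirely on which hypothesis the test puts in the null, which is why the two tests below must be read differently.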
The Augmented Dickey-Fuller (ADF) test, a parametric test, is often employed in this domain. Its hypotheses are:
- Null hypothesis (H0): The time series has a unit root (non-stationary).
- Alternative hypothesis (H1): The time series has no unit root (stationary).
The Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test approaches the problem from the opposite direction. It also uses a non-parametric correction for serial correlation in the errors, so it makes fewer assumptions about their structure, which helps it adapt to data sets that might not fit standard distributions. The hypotheses for the KPSS test are:
- Null hypothesis (H0): The time series is stationary (around a level or a deterministic trend; no unit root).
- Alternative hypothesis (H1): The time series is not stationary (a unit root is present).
Imagine you’re examining the price history of a particular stock over several years. The Augmented Dickey-Fuller (ADF) test would essentially ask, “Is there a persistent stochastic trend or drift in this stock’s price, making it unpredictable?” If the ADF test cannot rule out such a trend, the stock price movement is deemed non-stationary. The KPSS test approaches the data with the opposite assumption: it essentially checks, “Does the stock’s price stay stationary around a known level or trend?” If the KPSS test finds that the stock price deviates erratically from this expected pattern, it designates the series as non-stationary. In some cases, the ADF might indicate a time series is stationary while the KPSS suggests the opposite, or vice versa. Such discrepancies often arise when there are subtle trends or cycles in the data, which one test might capture and the other might overlook. These contrasting outcomes highlight the importance of using both tests to make a comprehensive assessment.
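Because the two tests have opposite nulls, their results combine into four cases: agreement on stationarity, agreement on non-stationarity, and two kinds of disagreement that are often read as hints of difference- or trend-stationarity. A hypothetical Python helper sketching this informal lookup (the p-values would come from actual ADF and KPSS runs; the labels are illustrative, not standard terminology):

```python
def combine_verdicts(adf_p, kpss_p, alpha=0.05):
    """Read ADF and KPSS p-values together.
    ADF rejecting its null (p <= alpha) points to stationarity;
    KPSS rejecting its null (p <= alpha) points to NON-stationarity."""
    adf_says_stationary = adf_p <= alpha
    kpss_says_stationary = kpss_p > alpha
    if adf_says_stationary and kpss_says_stationary:
        return "stationary"
    if not adf_says_stationary and not kpss_says_stationary:
        return "non-stationary"
    if adf_says_stationary:
        return "tests disagree: possibly difference-stationary"
    return "tests disagree: possibly trend-stationary"

print(combine_verdicts(0.01, 0.20))  # stationary
print(combine_verdicts(0.40, 0.01))  # non-stationary
```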
👣Example in Finance: Stock Price Analysis
Suppose an investment analyst has collected monthly returns for a specific stock over the past ten years. Before developing any predictive models, the analyst wants to ensure that the data is stationary.
Visual Tests for Stationarity: First, visual tests are a straightforward way to get an initial sense of the data’s behavior. In R, instead of just plotting the raw data, we can use the decompose() function, which breaks the time series down into trend, seasonal, and residual components.
data_ts <- ts(data, frequency = 12) # monthly data
decomposed_data <- decompose(data_ts)
plot(decomposed_data)
In Excel, visual tests can be done by simply creating a line chart of the stock returns. By observing the chart, if the returns appear to hover around a consistent mean without any discernible long-term trends, it suggests stationarity.
Augmented Dickey-Fuller Test: The ADF test checks for a unit root in the time series, helping us determine if it’s stationary.
library(tseries)
adf_test <- adf.test(data)
print(adf_test$p.value)
In Excel, while there isn’t a direct function for the ADF test, one can use the Analysis ToolPak add-in to perform regression analysis, which can then be used in conjunction with critical values from ADF tables to manually assess stationarity.
KPSS Test: The KPSS test checks the null hypothesis that the data is stationary around a level or a deterministic trend. Essentially, it evaluates whether the data’s patterns and behaviors remain constant over time.
kpss_test <- kpss.test(data)
print(kpss_test$p.value)
For Excel, there isn’t a direct KPSS function. However, one can employ third-party add-ins or tools that have integrated the KPSS test.
Assessing Stationarity in Excel: Beyond visual tests, Excel doesn’t have built-in functions for complex stationarity tests like ADF or KPSS. However, there are indirect ways:
- Moving Averages: By plotting moving averages alongside the stock returns and observing if the averages remain consistent, one can get an indication of stationarity.
- Variance Check: Split the data into two or more segments and compare their variances; roughly equal variances across segments hint at stationarity (at least in the variance).
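Both indirect checks translate directly into code outside Excel as well. A minimal Python sketch (the window size and the sample returns are invented for illustration):

```python
import statistics

def moving_average(series, window=12):
    """Trailing moving average over the given window."""
    return [statistics.mean(series[i - window:i])
            for i in range(window, len(series) + 1)]

def variance_ratio(series):
    """Split the series in half and compare segment variances;
    a ratio near 1 is consistent with constant variance."""
    mid = len(series) // 2
    return statistics.variance(series[:mid]) / statistics.variance(series[mid:])

# Invented monthly returns, just for illustration.
returns = [0.5, -0.3, 0.2, -0.1, 0.4, -0.2, 0.1, -0.4, 0.3, -0.2, 0.2, -0.1,
           0.3, -0.3, 0.1, -0.2, 0.4, -0.1, 0.2, -0.3, 0.1, -0.2, 0.3, -0.1]

print(moving_average(returns)[:3])
print(round(variance_ratio(returns), 2))
```

A variance ratio far from 1 (say, several times larger or smaller) would be a red flag worth following up with the formal tests above.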
Remember, while Excel provides handy tools for basic time series analysis, dedicated statistical software or programming languages like R offer more comprehensive methodologies for intricate tests.
By understanding stationarity and how to test for it, we can ensure that our time-series models are reliable and produce accurate forecasts. In the next article, I will address how we can make a time series stationary, ensuring robustness in our analytical pursuits.