Learn time series analysis with R, and use an R forecasting package to fit an optimal model to real time series.

A time series is a metric measured at regular time intervals. Examples of time series include financial data, stock prices, weather data, utility studies and many more.

Time series modelling serves two purposes:

- Understanding the underlying forces and structure that produced the observed data.
- Fitting a model and using it for forecasting, monitoring, or feedback and feedforward control.

In this tutorial, you will be given an overview of stationary and non-stationary time series models. You will be shown how to identify a time series by calculating its ACF and PACF. The plots of these functions make it possible to judge the stationarity of a time series. A non-stationary series can be made stationary by differencing it. Once the nature of a series is known, future values can be predicted from a model that the series follows. An illustration with real data from the TSA package of R is also part of this tutorial.

Let $X$ be a random variable indexed by time (usually denoted by $t$); the observations $\{x_t, t \in N\}$ are called a time series. $N$ is the integer set, which is considered here as the time index set. $N$ can also be a set of timestamps. Stationarity is a critical assumption in time series models: it implies homogeneity, in the sense that the series behaves in a similar way regardless of time, which means that its statistical properties do not change over time. There are two forms of stationarity: strong and weak.

A process $\{x_t, t \in N\}$ is said to be strictly or strongly stationary if its statistical distributions remain unchanged after a shift of the time scale. Since the distributions of a stochastic process are defined by the finite-dimensional distribution functions, we can formulate an alternative definition of strict stationarity. If for every $n$, every choice of times $t_1, t_2, \ldots, t_n \in N$ and every time lag $k$ such that $t_i + k \in N$, the $n$-dimensional random vector $(X_{t_1+k}, X_{t_2+k}, \ldots, X_{t_n+k})$ has the same distribution as the vector $(X_{t_1}, X_{t_2}, \ldots, X_{t_n})$, then the process is strictly stationary. That is,

$$P(X_{t_1} \le x_1, X_{t_2} \le x_2, \ldots, X_{t_k} \le x_k) = F(x_{t_1}, x_{t_2}, \ldots, x_{t_k}) = F(x_{h+t_1}, x_{h+t_2}, \ldots, x_{h+t_k}) = P(X_{h+t_1} \le x_1, X_{h+t_2} \le x_2, \ldots, X_{h+t_k} \le x_k)$$

for any time shift $h$ and observation $x_j$. If $\{X_t, t \in N\}$ is strictly stationary, then the marginal distribution of $X_t$ is independent of $t$. Also, the two-dimensional distributions of $(X_{t_1}, X_{t_2})$ are independent of the absolute location of $t_1$ and $t_2$; only the distance $t_1 - t_2$ matters. As a consequence, the mean function $E(X_t)$ is constant, and the covariance $Cov(X_t, X_{t-k})$ is a function of $k$ only, not of the absolute location $t$. For higher-order moments, like the third-order moment, $E[X_s X_t X_u]$ remains unchanged if one adds a constant time shift to $s, t, u$.

A univariate time series $X_t$ is stationary if its mean, variance and covariance are independent of time. Thus, if $X_t$ is a time series (or stochastic process, meaning random variables ordered in time) that is defined for $t = 1, 2, 3, \ldots, n$ and for $t = 0, -1, -2, -3, \ldots$, then $X_t$ is mathematically weakly stationary if

$$\begin{aligned}
\text{(i)} \quad & E[X_t] = \mu \\
\text{(ii)} \quad & E[(X_t - \mu)^2] = Var(X_t) = \gamma(0) = \sigma^2 \\
\text{(iii)} \quad & E[(X_t - \mu)(X_{t-k} - \mu)] = Cov(X_t, X_{t-k}) = \gamma(k)
\end{aligned}$$

The first two conditions ((i) and (ii)) require the process to have constant mean and variance respectively, while (iii) requires that the covariance between any two values (formally called the covariance function) depends only on the time interval $k$ between these two values and not on the point in time $t$.

If a process is Gaussian with finite second moments, then weak stationarity is equivalent to strong stationarity. Strict stationarity implies weak stationarity only if the necessary moments exist. Strong stationarity also requires distributional assumptions. The strong form is generally regarded as too strict, and therefore you will mainly be concerned with weak stationarity, sometimes known as covariance stationarity, wide-sense stationarity or second-order stationarity.

A time series in which the observations fluctuate around a constant mean, have constant variance and are stochastically independent is a `random time series`. Such a time series doesn’t exhibit any pattern:

- Observations do not tend upwards or downwards
- Variance does not increase or decrease with time
- Observations do not tend to be larger in some periods than in others

An example of a stationary random model can be written as

$$X_t = \mu + \varepsilon_t$$

where $\mu$ is a constant mean such that $E[X_t] = \mu$ and $\varepsilon_t$ is the noise term, assumed to have zero mean and constant variance and to be independent over time (also known as white noise).

```
# purely random process with mean 0 and standard deviation 1
eps <- rnorm(100, mean = 0, sd = 1)
mu <- 2 # the constant mean
# The process
X_t <- mu + eps
# plotting the time series
ts.plot(X_t, main = "Example of (random) stationary time series", ylab = expression(X[t]))
```

The simulated process fluctuates around the constant mean $\mu = 2$.
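As a rough numerical check of the constant mean and constant variance, the sketch below simulates a longer realisation of the same model and compares its first and second halves. The seed, the sample size `n` and the names `first_half`, `second_half` and `moments` are illustrative assumptions, not part of the original example.

```r
# Sketch: compare sample moments over two halves of a simulated stationary series
set.seed(1)                       # arbitrary seed, for reproducibility only
n  <- 1000                        # assumed sample size
mu <- 2                           # constant mean, as in the model X_t = mu + eps_t
x  <- mu + rnorm(n, mean = 0, sd = 1)

first_half  <- x[1:(n / 2)]
second_half <- x[(n / 2 + 1):n]

# For a stationary series both halves should give similar values
moments <- rbind(
  first_half  = c(mean = mean(first_half),  var = var(first_half)),
  second_half = c(mean = mean(second_half), var = var(second_half))
)
print(round(moments, 3))
```

Similar values in both rows are consistent with stationarity, although such a comparison is only an informal diagnostic and not a formal test.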

The theoretical autocorrelation function (ACF) of a stationary stochastic process is an important tool for assessing the properties of time series. Let $X_t$ be a stationary stochastic process with mean $\mu$, variance $\sigma^2$ and auto-covariance function $\gamma(k)$. The ACF at lag $k$, $\rho(k)$, is

$$\rho(k) = \frac{\gamma(k)}{\gamma(0)} = \frac{\gamma(k)}{\sigma^2}$$

The ACF is a normalized measure of the auto-covariance and possesses several properties:

- $\rho(0) = 1$
- The ACF is an even function of the lags, that is, $\rho(k) = \rho(-k)$
- $|\rho(k)| \le 1$

The ACF of the above process is represented by the following figure.

```
# Autocorrelation function of the simulated stationary random time series
acf(X_t, main = "Autocorrelation function of X")
```
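The values that `acf` plots can also be computed by hand, which makes the normalisation by the lag-0 auto-covariance explicit. The sketch below is a minimal illustration; the helper `sample_acf`, the seed and the choice of five lags are assumptions introduced here. The built-in function is called as `stats::acf` so that the base implementation is used even if another loaded package masks `acf`.

```r
# Sketch: sample autocorrelation computed by hand versus stats::acf
set.seed(2)                                  # arbitrary seed
x <- 2 + rnorm(100)                          # same model as X_t = mu + eps_t

sample_acf <- function(x, lag_max) {
  n    <- length(x)
  xbar <- mean(x)
  # c_k = (1/n) * sum_{t=1}^{n-k} (x_t - xbar) * (x_{t+k} - xbar)
  c_k <- sapply(0:lag_max, function(k) {
    sum((x[1:(n - k)] - xbar) * (x[(1 + k):n] - xbar)) / n
  })
  c_k / c_k[1]                               # r_k = c_k / c_0
}

r_manual  <- sample_acf(x, lag_max = 5)
r_builtin <- as.numeric(stats::acf(x, lag.max = 5, plot = FALSE)$acf)
print(round(cbind(lag = 0:5, r_manual, r_builtin), 4))
```

The two columns should agree, and the lag-0 value is 1 by construction.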

Note 1: The lack of uniqueness is a characteristic of the ACF. Even if a given random process has a unique covariance structure, the opposite is generally not true: it is possible to find more than one stochastic process with the same ACF. This causes specification problems, as illustrated in [@jenkinsd].

Note 2: A very special matrix is obtained from the autocorrelation function of a stationary process. It is called the Toeplitz matrix. It is a kind of variance-covariance matrix of order $m$ (including autocorrelations up to lag $m-1$); it has constant diagonals and is symmetric and positive semi-definite.

$$\begin{pmatrix}
1 & \rho(1) & \rho(2) & \cdots & \rho(m-1) \\
\rho(1) & 1 & \rho(1) & \cdots & \rho(m-2) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\rho(m-1) & \rho(m-2) & \cdots & \rho(1) & 1
\end{pmatrix}$$
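To make the structure of this matrix concrete, the following sketch builds it with `toeplitz` from the stats package. The autocorrelations used, $\rho(k) = 0.5^k$ (the ACF of an AR(1) process with parameter 0.5), and the order `m = 4` are assumptions chosen purely for illustration.

```r
# Sketch: autocorrelation (Toeplitz) matrix for an assumed ACF rho(k) = 0.5^k
alpha <- 0.5                      # assumed parameter, illustration only
m     <- 4                        # order of the matrix
rho   <- alpha^(0:(m - 1))        # rho(0), rho(1), ..., rho(m - 1)

R <- toeplitz(rho)                # symmetric matrix with constant diagonals
print(R)

# Symmetry and eigenvalues (all positive for this particular choice of rho)
print(isSymmetric(R))
print(eigen(R, symmetric = TRUE)$values)
```

Every diagonal of `R` is constant, which is the defining property of a Toeplitz matrix.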

A discrete process $\{Z_t\}$ is called a purely random process if the random variables $Z_t$ form a sequence of mutually independent and identically distributed (i.i.d.) variables. The definition implies that the process has constant mean and variance, and that

$$\gamma(k) = cov(Z_t, Z_{t+k}) = 0, \quad \forall k \in \{\pm 1, \pm 2, \pm 3, \ldots\}$$

Given that the mean and the autocovariance function (acvf) do not depend on time, the process is second-order stationary.

A process $\{X_t\}$ is said to be a random walk process if $X_t = X_{t-1} + Z_t$, with $Z_t$ a purely random process with mean $\mu_Z$ and variance $\sigma_Z^2$. The process usually starts at $t = 0$ with $X_0 = 0$, so that $X_1 = Z_1$. We have

$$\begin{aligned}
X_1 &= X_0 + Z_1, && \text{at } t = 1 \\
X_2 &= X_1 + Z_2 = X_0 + Z_1 + Z_2, && \text{at } t = 2 \\
X_3 &= X_2 + Z_3 = X_0 + Z_1 + Z_2 + Z_3, && \text{at } t = 3 \\
&\;\;\vdots \\
X_t &= X_0 + \sum_{i=1}^{t} Z_i
\end{aligned}$$

The first-order moment (or the expected value) of this process is equal to

$$E[X_t] = X_0 + \sum_{i=1}^{t} E[Z_i] = X_0 + t\mu_Z = t\mu_Z$$

and the variance is

$$Var(X_t) = t\sigma_Z^2$$

Note that the mean and variance change with time, so the process is **non-stationary**. Share prices are an example of time series that behave like random walks.

```
# seed X_0 = 0
X <- 0
# purely random process with mean 0.5 and standard deviation 1.5
Z <- rnorm(100, mean = 0.5, sd = 1.5)
# the process
for (i in 2:length(Z)) {
  X[i] <- X[i-1] + Z[i]
}
# process plotting
ts.plot(X, main = "Random walk process")
```
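The growth of the mean and variance with $t$ can be checked by simulating many independent random walks and comparing the empirical moments across paths with $t\mu_Z$ and $t\sigma_Z^2$. The sketch below is a Monte Carlo illustration only; the number of paths, the seed and the variable names are assumptions.

```r
# Sketch: Monte Carlo check of E[X_t] = t * mu_Z and Var(X_t) = t * sigma_Z^2
set.seed(123)                     # arbitrary seed
n_paths <- 2000                   # assumed number of simulated random walks
n_steps <- 100
mu_z    <- 0.5
sigma_z <- 1.5

# Each column is one path of increments; cumsum turns increments into a walk
Z <- matrix(rnorm(n_steps * n_paths, mean = mu_z, sd = sigma_z), nrow = n_steps)
X <- apply(Z, 2, cumsum)          # row t holds X_t = Z_1 + ... + Z_t for every path

t_check  <- c(10, 50, 100)
emp_mean <- rowMeans(X)[t_check]
emp_var  <- apply(X, 1, var)[t_check]

print(cbind(t = t_check,
            emp_mean, theo_mean = t_check * mu_z,
            emp_var,  theo_var  = t_check * sigma_z^2))
```

The empirical columns should track the theoretical ones up to simulation noise, with both moments increasing in `t`.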

Differencing is the most common method for making time series data stationary. It is a special type of filtering, particularly important for removing a trend. For non-seasonal data, first-order differencing is usually sufficient to attain stationarity in the mean. Let $X_t = \{X_1, X_2, \ldots, X_n\}$ be a non-stationary time series. The stationary series is obtained as

$$\Delta X_{t+1} = X_{t+1} - X_t \quad \text{or} \quad \Delta X_t = X_t - X_{t-1}$$

which is simply called the first-order difference. If a second-order difference is required, you can use the operator $\Delta^2$, which is the difference of first-order differences:

$$\Delta^2 X_{t+2} = \Delta X_{t+2} - \Delta X_{t+1}$$

```
# differencing and plotting of the random walk process
ts.plot(diff(X))
```

You can see that the resulting first-order difference fluctuates around a constant mean (the mean of $Z_t$, which is 0.5 in the simulation above). This is because mathematically

$$\Delta X_{t+1} = X_{t+1} - X_t = Z_{t+1}$$

which is stationary because it’s a purely random process with constant mean and constant variance.
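The identity can be verified numerically: differencing a simulated random walk returns its increments exactly. The sketch below also checks that `diff(X, differences = 2)` is the difference of the first-order differences. The seed and the variable names `d1` and `d2` are illustrative assumptions.

```r
# Sketch: the first difference of a random walk recovers the increments
set.seed(10)                      # arbitrary seed
Z <- rnorm(50, mean = 0.5, sd = 1.5)
X <- cumsum(Z)                    # random walk with X_0 = 0: X_t = Z_1 + ... + Z_t

d1 <- diff(X)                     # X_t - X_{t-1} for t = 2, ..., 50
d2 <- diff(X, differences = 2)    # second-order difference

print(all.equal(d1, Z[-1]))            # increments Z_2, ..., Z_50
print(all.equal(d2, diff(diff(X))))    # difference of first-order differences
```

Each round of differencing shortens the series by one observation, so `d1` has 49 values and `d2` has 48.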

Let $\{Z_t\}$ be a purely random process with mean zero and variance $\sigma_Z^2$. The process $\{X_t\}$ is said to be a moving average process of order $q$, MA($q$), if

$$X_t = \beta_0 Z_t - \beta_1 Z_{t-1} - \cdots - \beta_q Z_{t-q}$$

where $\beta_i, i = 1, 2, \ldots, q$ are constants. The random variables $Z_t, t \in N$ are usually scaled so that $\beta_0 = 1$.

```
# purely random process with mean 0 and standard deviation 1.5 (arbitrary choice)
Z <- rnorm(100, mean = 0, sd = 1.5)
# process simulation
X <- numeric(length(Z))
X[1] <- Z[1]  # no earlier shock is available for the first value
for (i in 2:length(Z)) {
  X[i] <- Z[i] - 0.45*Z[i-1]
}
# process plotting
ts.plot(X, main = "Moving Average of order 1 process")
```

For the MA(1) process $X_t = Z_t - \beta Z_{t-1}$, the three conditions can be verified as follows:

$$\begin{aligned}
E[X_t] &= 0 \\
Var(X_t) &= E[X_t^2] - 0 = E[Z_t^2 - 2\beta Z_t Z_{t-1} + \beta^2 Z_{t-1}^2] = \sigma_Z^2 + \beta^2\sigma_Z^2 \\
\gamma(k) &= cov(X_t, X_{t+k}) = E[X_t X_{t+k}] - E[X_t]E[X_{t+k}] = E[X_t X_{t+k}] \\
&= E[(Z_t - \beta Z_{t-1})(Z_{t+k} - \beta Z_{t-1+k})] \\
&= E[Z_t Z_{t+k} - \beta Z_t Z_{t-1+k} - \beta Z_{t-1} Z_{t+k} + \beta^2 Z_{t-1} Z_{t-1+k}]
\end{aligned}$$

For $k = 0$,

$$\gamma(0) = Var(X_t) = \sigma_Z^2(1 + \beta^2)$$

for $k = 1$,

$$\gamma(1) = E[Z_t Z_{t+1} - \beta Z_t^2 - \beta Z_{t-1} Z_{t+1} + \beta^2 Z_{t-1} Z_t] = -\beta\sigma_Z^2$$

and for $k > 1$, $\gamma(k) = 0$.

Thus, the MA(1) process has a covariance of zero when the displacement is more than one period. That is, it has a memory of only one period.

The ACF for MA(1) is therefore

$$\rho(k) = \frac{\gamma(k)}{\gamma(0)} = \begin{cases} 1, & k = 0 \\ \dfrac{-\beta}{1 + \beta^2}, & k = \pm 1 \\ 0, & \text{otherwise} \end{cases}$$
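The MA(1) autocorrelations can be compared with `ARMAacf` from the stats package and with the sample ACF of a long simulated series. This is a sketch under stated assumptions: $\beta = 0.45$, the seed and the sample size are arbitrary choices. Note that `ARMAacf` writes the moving average part with a plus sign, so the coefficient is passed as `ma = -beta` to match the sign convention used here.

```r
# Sketch: theoretical versus sample ACF of X_t = Z_t - beta * Z_{t-1}
set.seed(42)                      # arbitrary seed
beta <- 0.45                      # assumed MA(1) coefficient
n    <- 5000                      # assumed sample size

Z <- rnorm(n + 1, mean = 0, sd = 1.5)
X <- Z[-1] - beta * Z[-(n + 1)]   # X_t = Z_t - beta * Z_{t-1}, vectorised

rho_theory  <- c(1, -beta / (1 + beta^2), 0, 0)               # lags 0 to 3
rho_armaacf <- unname(ARMAacf(ma = -beta, lag.max = 3))       # stats sign convention
rho_sample  <- as.numeric(stats::acf(X, lag.max = 3, plot = FALSE)$acf)

print(round(rbind(rho_theory, rho_armaacf, rho_sample), 4))
```

The first two rows should coincide, while the sample row should be close to them, with lags 2 and 3 near zero.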

**Question:** Redo the work for MA(2).

It can be seen why the sample autocorrelation function is useful in specifying the order of a moving average process: the autocorrelation function $\rho(k)$ of the MA($q$) process has $q$ non-zero values (significantly different from zero) and is zero for $k > q$. No restrictions on $\{\beta_i\}$ are required for an MA process to be stationary. However, $\{\beta_i\}$ have to be restricted to ensure `invertibility`.

Let $\{Z_t\}$ be a purely random process with mean zero and variance $\sigma_Z^2$. A process $\{X_t\}$ is called an autoregressive process of order $p$ if

$$X_t = \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \cdots + \alpha_p X_{t-p} + Z_t$$

In the autoregressive process of order $p$, the current observation $X_t$ (today’s return, for instance) is generated by a weighted average of past observations going back $p$ periods, together with a random disturbance in the current period. $X_t$ is regressed not on independent variables but on its own past values, hence the name autoregressive. These types of processes were introduced by [@greenwood1920inquiry]. The AR($p$) above can be written with a constant term

$$X_t = \delta + \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \cdots + \alpha_p X_{t-p} + Z_t$$

where $\delta$ is a constant term related to the mean of the time series and $\alpha_1, \alpha_2, \ldots, \alpha_p$ can be positive or negative. If the AR($p$) is stationary, then $E[X_t] = E[X_{t-1}] = \cdots = E[X_{t-p}] = \mu$. Thus,

$$E[X_t] = \mu = \delta + \alpha_1\mu + \alpha_2\mu + \cdots + \alpha_p\mu + 0 \;\Rightarrow\; \mu = \frac{\delta}{1 - \alpha_1 - \alpha_2 - \cdots - \alpha_p}$$

For the last formula to be a constant, we consider the condition $\alpha_1 + \alpha_2 + \cdots + \alpha_p < 1$.

Consider the case when $p = 1$; then

$$X_t = \alpha X_{t-1} + Z_t$$

is the first-order autoregressive process AR(1), which is also known as a **Markov process**. Using the backshift operator $B X_t = X_{t-1}$, you can express AR(1) as an infinite MA process. We have

$$(1 - \alpha B) X_t = Z_t$$

so,

$$X_t = \frac{Z_t}{1 - \alpha B} = (1 + \alpha B + \alpha^2 B^2 + \cdots) Z_t = Z_t + \alpha Z_{t-1} + \alpha^2 Z_{t-2} + \cdots = Z_t + \beta_1 Z_{t-1} + \beta_2 Z_{t-2} + \cdots$$

Then $E[X_t] = 0$ and $Var(X_t) = \sigma_Z^2(1 + \alpha^2 + \alpha^4 + \cdots)$. The series converges under the condition $|\alpha| < 1$.
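The infinite MA representation can be inspected with `ARMAtoMA` from the stats package, which returns the weights on $Z_{t-1}, Z_{t-2}, \ldots$. The sketch below assumes $\alpha = 0.5$ and $\sigma_Z = 1.5$ for illustration, and compares the truncated variance series with the geometric series limit $\sigma_Z^2 / (1 - \alpha^2)$, a standard result that the text above does not state explicitly.

```r
# Sketch: MA(infinity) weights and variance of an AR(1) process
alpha   <- 0.5                    # assumed AR(1) coefficient, |alpha| < 1
sigma_z <- 1.5                    # assumed standard deviation of Z_t

# Weights on Z_{t-1}, Z_{t-2}, ...: they should equal alpha, alpha^2, ...
psi <- ARMAtoMA(ar = alpha, lag.max = 50)
print(all.equal(psi[1:5], alpha^(1:5)))

# sigma_z^2 * (1 + alpha^2 + alpha^4 + ...) truncated at 50 terms,
# compared with the limit of the geometric series
var_truncated <- sigma_z^2 * (1 + sum(psi^2))
var_limit     <- sigma_z^2 / (1 - alpha^2)
print(c(truncated = var_truncated, limit = var_limit))
```

With $|\alpha| < 1$ the weights decay geometrically, so a modest truncation already reproduces the limit very closely.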

**Question:** Given an AR(1) process $X_t = \alpha X_{t-1} + Z_t$, where $Z_t$ is a purely random process with mean zero and variance $\sigma_Z^2$ and $\alpha$ is a constant satisfying the necessary conditions, derive the variance and auto-covariance function of $X_t$.

```
# constant alpha
alpha <- 0.5
# purely random process with mean 0 and standard deviation 1.5
Z <- rnorm(100, mean = 0, sd = 1.5)
# seed
X <- rnorm(1)
# the process
for (i in 2:length(Z)) {
  X[i] <- alpha*X[i-1] + Z[i]
}
# process plotting
ts.plot(X)
```

**Question:** Express the stationarity condition of the AR(2) model in terms of its parameter values. That is, show the following conditions:

$$\alpha_1 + \alpha_2 < 1, \qquad \alpha_2 - \alpha_1 < 1, \qquad |\alpha_2| < 1$$

In model building, it may be necessary to mix both AR and MA terms in the model. This leads to the mixed autoregressive-moving average (ARMA) process. An ARMA process which contains $p$ autoregressive terms and $q$ moving average terms is said to be of order $(p, q)$ and is given by

$$X_t = \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \cdots + \alpha_p X_{t-p} + Z_t - \beta_1 Z_{t-1} - \beta_2 Z_{t-2} - \cdots - \beta_q Z_{t-q}$$

Using the backshift operator $B$, the equation can be written as

$$\alpha_p(B) X_t = \beta_q(B) Z_t$$

where $\alpha_p(B)$ and $\beta_q(B)$ are polynomials of order $p$ and $q$ respectively, such that

$$\begin{aligned}
\alpha_p(B) &= 1 - \alpha_1 B - \cdots - \alpha_p B^p \\
\beta_q(B) &= 1 - \beta_1 B - \cdots - \beta_q B^q
\end{aligned}$$

For the process to be invertible, the roots of $\beta_q(B) = 0$ must lie outside the unit circle. To be stationary, we require the roots of $\alpha_p(B) = 0$ to lie outside the unit circle. It is also assumed that $\alpha_p(B) = 0$ and $\beta_q(B) = 0$ share no common roots.
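The root conditions can be checked numerically with `polyroot`, which takes polynomial coefficients in increasing order of power. The helper `roots_outside_unit_circle` below is a hypothetical name introduced for this sketch, and the coefficient values are arbitrary examples. Because both polynomials have the form $1 - c_1 B - \cdots - c_k B^k$, the same helper serves for stationarity (AR coefficients) and invertibility (MA coefficients).

```r
# Sketch: check whether all roots of 1 - c_1 z - ... - c_k z^k lie outside the unit circle
roots_outside_unit_circle <- function(coefs) {
  roots <- polyroot(c(1, -coefs))   # coefficients in increasing order of power
  all(Mod(roots) > 1)
}

ar2_ok <- roots_outside_unit_circle(c(0.5, 0.3))  # AR(2), alpha = (0.5, 0.3): stationary
ar1_ok <- roots_outside_unit_circle(1.2)          # AR(1), alpha = 1.2: not stationary
ma1_ok <- roots_outside_unit_circle(0.4)          # MA(1), beta = 0.4: invertible

print(c(ar2 = ar2_ok, ar1 = ar1_ok, ma1 = ma1_ok))
```

A coefficient that puts a root exactly on the unit circle (for example a random walk) is a boundary case where floating-point error can matter, so this simple check should be treated with care there.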

The ARMA(1,1) process is of order $(1,1)$ and is given by the equation

$$X_t = \alpha_1 X_{t-1} + Z_t - \beta_1 Z_{t-1}$$

with the conditions $|\alpha_1| < 1$ and $|\beta_1| < 1$, needed for stationarity and invertibility respectively. When $\alpha_1 = 0$, the ARMA(1,1) reduces to an MA(1), and when $\beta_1 = 0$ we have an AR(1). The process can be transformed into a pure autoregressive representation using the backshift operator:

$$(1 - \alpha_1 B) X_t = (1 - \beta_1 B) Z_t$$

We have

$$\frac{1 - \alpha_1 B}{1 - \beta_1 B} X_t = Z_t \quad \Longleftrightarrow \quad \pi(B) X_t = Z_t$$

with

$$\begin{aligned}
\pi(B) = 1 - \pi_1 B - \pi_2 B^2 - \cdots &= \frac{1 - \alpha_1 B}{1 - \beta_1 B} \\
(1 - \beta_1 B)(1 - \pi_1 B - \pi_2 B^2 - \cdots) &= 1 - \alpha_1 B \\
1 - (\pi_1 + \beta_1) B - (\pi_2 - \beta_1 \pi_1) B^2 - (\pi_3 - \beta_1 \pi_2) B^3 - \cdots &= 1 - \alpha_1 B
\end{aligned}$$

By identification of the coefficients of the powers of $B$ on both sides of the equation, we get the unknown terms as

$$\pi_j = \beta_1^{j-1}(\alpha_1 - \beta_1), \quad \text{for } j \ge 1$$

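The expectation of $X_t$ is given by $E[X_t] = \alpha_1 E[X_{t-1}]$ and, under the stationarity assumption, we have $E[X_t] = E[X_{t-1}] = \mu = 0$. This is useful for the calculation of the auto-covariance function, which is obtained as follows:

- Multiplying the model equation by $X_{t-k}$ gives $X_t X_{t-k} = \alpha_1 X_{t-k} X_{t-1} + X_{t-k} Z_t - \beta_1 X_{t-k} Z_{t-1}$
- Taking the expectation on both sides, we get:

$$\begin{aligned}
E[X_t X_{t-k}] &= E[\alpha_1 X_{t-k} X_{t-1}] + E[X_{t-k} Z_t] - \beta_1 E[X_{t-k} Z_{t-1}] \\
\gamma(k) &= \alpha_1 \gamma(k-1) + E[X_{t-k} Z_t] - \beta_1 E[X_{t-k} Z_{t-1}]
\end{aligned}$$

For $k = 0$, using $E[X_t Z_t] = \sigma_Z^2$ and $E[X_t Z_{t-1}] = (\alpha_1 - \beta_1)\sigma_Z^2$,

$$\gamma(0) = \alpha_1 \gamma(1) + \sigma_Z^2 - \beta_1(\alpha_1 - \beta_1)\sigma_Z^2$$

For $k = 1$, using $E[X_{t-1} Z_t] = 0$ and $E[X_{t-1} Z_{t-1}] = \sigma_Z^2$,

$$\begin{aligned}
\gamma(1) &= \alpha_1 \gamma(0) + E[X_{t-1} Z_t] - \beta_1 E[X_{t-1} Z_{t-1}] \\
&= \alpha_1\left(\alpha_1 \gamma(1) + \sigma_Z^2 - \beta_1(\alpha_1 - \beta_1)\sigma_Z^2\right) - \beta_1 \sigma_Z^2 \\
&= \alpha_1^2 \gamma(1) + (\alpha_1 - \beta_1)(1 - \alpha_1\beta_1)\sigma_Z^2 \\
\Rightarrow \gamma(1) &= \frac{(\alpha_1 - \beta_1)(1 - \alpha_1\beta_1)\sigma_Z^2}{1 - \alpha_1^2} \\
\Rightarrow \gamma(0) &= \frac{\alpha_1(\alpha_1 - \beta_1)(1 - \alpha_1\beta_1)\sigma_Z^2}{1 - \alpha_1^2} + \sigma_Z^2 - \beta_1(\alpha_1 - \beta_1)\sigma_Z^2 = \frac{(1 + \beta_1^2 - 2\alpha_1\beta_1)\sigma_Z^2}{1 - \alpha_1^2}
\end{aligned}$$

For $k \ge 2$,

$$\gamma(k) = \alpha_1 \gamma(k-1)$$

Hence, the ACF of ARMA(1,1) is:

$$\rho(k) = \begin{cases} 1, & \text{for } k = 0 \\ \dfrac{(\alpha_1 - \beta_1)(1 - \alpha_1\beta_1)}{1 + \beta_1^2 - 2\alpha_1\beta_1}, & \text{for } k = 1 \\ \alpha_1 \rho(k-1), & \text{for } k \ge 2 \end{cases}$$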
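The closed-form ACF can be cross-checked against `ARMAacf` from the stats package. The parameter values below are assumptions chosen for illustration. As `ARMAacf` writes the moving average part with a plus sign, $\beta_1$ is passed as `ma = -beta1` to match the convention $X_t = \alpha_1 X_{t-1} + Z_t - \beta_1 Z_{t-1}$.

```r
# Sketch: ARMA(1,1) ACF from the closed-form expression versus stats::ARMAacf
alpha1 <- 0.35                    # assumed AR coefficient
beta1  <- -0.4                    # assumed MA coefficient (document sign convention)

rho1 <- (alpha1 - beta1) * (1 - alpha1 * beta1) /
        (1 + beta1^2 - 2 * alpha1 * beta1)

# rho(0), rho(1), then rho(k) = alpha1 * rho(k - 1) for k >= 2
rho_formula <- c(1, rho1, alpha1 * rho1, alpha1^2 * rho1)

# stats::ARMAacf uses X_t = ar * X_{t-1} + Z_t + ma * Z_{t-1}, hence ma = -beta1
rho_stats <- unname(ARMAacf(ar = alpha1, ma = -beta1, lag.max = 3))

print(round(rbind(rho_formula, rho_stats), 5))
```

Agreement between the two rows supports the derivation; from lag 2 onwards the values decay by the factor $\alpha_1$, as for an AR(1).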

The ACF of ARMA(1,1) combines the characteristics of both AR(1) and MA(1) processes. The MA(1) and AR(1) parameters are both present in $\rho(1)$. Beyond $\rho(1)$, the ACF of an ARMA(1,1) model follows the same pattern as the ACF of an AR(1).

**Question:** Find the partial autocorrelation function (PACF) of the ARMA(1,1) process.

**Note:** Time series processes are characterized in terms of their ACF and PACF. One of the most crucial steps in time series analysis is to identify and build a model based on the available data, where the theoretical ACF and PACF are unknown.

```
# purely random process with mean 0 and standard deviation 1.5
Z <- rnorm(100, mean = 0, sd = 1.5)
# Process
X <- rnorm(1)
for (i in 2:length(Z)) {
X[i] <- 0.35*X[i-1] + Z[i] + 0.4*Z[i-1]
}
# process plotting
ts.plot(X, main = "ARMA(1,1) process")
# ACF and PACF
par(mfrow = c(1,2))
acf(X); pacf(X)
```

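Autoregressive Integrated Moving Average (ARIMA) models are time series defined by the equation:

$$\alpha_p(B)(1 - B)^d X_t = \beta_q(B) Z_t$$

where $d$ is the number of times the series is differenced to make it stationary.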

In this section, you will fit an optimal model to real time series. For that purpose, you’ll use the `forecast` package. The function `auto.arima` fits and selects the optimal model from the data, and the `forecast` function allows the prediction of `h` periods ahead.

```
# R packages to be used
library(forecast)
library(TSA)
```

**Example 1:**

```
# Data from TSA package
data("co2")
# fitting
fit <- auto.arima(co2)
# Time series plot
plot(fc <- forecast(fit, h = 15))
```

**Example 2:**

```
data("boardings")
# fitting
fit2 <- auto.arima(boardings[,"log.price"])
# forecasting
plot(fc2 <- forecast(fit2, h = 15))
```
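For comparison, and only as a sketch, a seasonal ARIMA model can also be fitted with base R alone. The example below does not use `auto.arima`: the order is fixed by hand to the classical "airline" specification, and the built-in `AirPassengers` data set stands in for the TSA data. Both choices are assumptions made for illustration, not a model selected by the tutorial. The call is written as `stats::arima` so that the base implementation is used even if another loaded package masks `arima`.

```r
# Base R sketch (stats package only); order chosen by hand, not by auto.arima
y <- log(AirPassengers)           # built-in monthly series, log-transformed

fit_base <- stats::arima(y,
                         order    = c(0, 1, 1),
                         seasonal = list(order = c(0, 1, 1), period = 12))

# Point forecasts and standard errors for the next 12 months (log scale)
pred <- predict(fit_base, n.ahead = 12)

# Approximate 95% interval, back-transformed to the original scale
forecast_tab <- cbind(point = as.numeric(exp(pred$pred)),
                      lower = as.numeric(exp(pred$pred - 1.96 * pred$se)),
                      upper = as.numeric(exp(pred$pred + 1.96 * pred$se)))
print(round(forecast_tab, 1))
```

Unlike `auto.arima`, this approach leaves the choice of orders to the analyst, typically guided by the ACF and PACF of the differenced series.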

Greenwood, Major, and G Udny Yule. 1920. “An Inquiry into the Nature of Frequency Distributions Representative of Multiple Happenings with Particular Reference to the Occurrence of Multiple Attacks of Disease or of Repeated Accidents.” Journal of the Royal Statistical Society 83 (2). JSTOR: 255–79.

Jenkins, G. M., and D. G. Watts. 1968. Spectral Analysis and Its Applications. San Francisco: Holden-Day.

In this tutorial, you covered many details of time series in R. You have learned what a stationary process is and how to simulate random variables, random time series, random walk processes, and more. You also covered the autoregressive process of order $p$, AR($p$), the SARIMA$(p,d,q)(P,D,Q)$ process, and forecasting.

If you would like to learn more about time series analysis in R, take DataCamp’s Introduction to Time Series Analysis course.

*Source: https://www.datacamp.com/community/tutorials/time-series-r*
