Time Series
Department of Mathematical Sciences, Aalborg University
There has been a huge increase in the amount of data available.
This has led to the development of new techniques to analyze and extract information from data.
The dynamics of the data can be complex, but the data often have an underlying low-dimensional structure that simpler representations can exploit.
Today, we will discuss some of these techniques: principal component analysis and factor models.
Principal component analysis (PCA) is a technique used to reduce the dimensionality of a dataset.
It is based on the idea of finding the directions in which the data has the largest variance.
These directions, called principal components, can be used to capture most of the information in the data with fewer variables.
Let \(X_1, X_2,\ldots, X_p\) be a set of variables, each centered at zero.
The first principal component is the normalized linear combination of the features \[Z_1 = \phi_{11}X_1 +\phi_{21}X_2 +\cdots+\phi_{p1}X_p,\] that has the largest variance.
In matrix form, \(Z_1 = \mathbb{X}\Phi_1\), where \(\mathbb{X}\) is the \(n\times p\) data matrix with columns \(X_1,\ldots,X_p\) and \(\Phi_1 = (\phi_{11},\ldots,\phi_{p1})'\) is the vector of loadings.
The variance of the first principal component is given by \[Var(Z_1) = \frac{1}{n}\Phi_1'\mathbb{X}'\mathbb{X}\Phi_1.\]
Hence, the first principal component solves \[\max_{\Phi_1} \Phi_1'\mathbb{X}'\mathbb{X}\Phi_1,\ \ \ \text{subject to} \ \ \ \Phi_1'\Phi_1 = 1.\]
The associated Lagrangian is \[\mathcal{L}(\Phi_1,\lambda_1) = \Phi_1'\mathbb{X}'\mathbb{X}\Phi_1 - \lambda_1(\Phi_1'\Phi_1-1).\]
The first order conditions are given by \[\begin{align} \frac{\partial \mathcal{L}}{\partial \Phi_1} &= 2\mathbb{X}'\mathbb{X}\Phi_1 - 2\lambda_1\Phi_1 = 0,\\ \frac{\partial \mathcal{L}}{\partial \lambda_1} &= \Phi_1'\Phi_1-1=0. \end{align}\]
From the first equation, \(\mathbb{X}'\mathbb{X}\Phi_1 = \lambda_1\Phi_1\), so \(\Phi_1\) is an eigenvector of \(\mathbb{X}'\mathbb{X}\). Since the objective then equals \(\Phi_1'\mathbb{X}'\mathbb{X}\Phi_1 = \lambda_1\), the first principal component corresponds to the eigenvector of \(\mathbb{X}'\mathbb{X}\) with the largest eigenvalue, \(\lambda_1\).
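A minimal numerical sketch of this result (data and dimensions are illustrative): the first loading vector is recovered as the top eigenvector of \(\mathbb{X}'\mathbb{X}\), and the variance of the resulting component equals \(\lambda_1/n\).

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3
# Illustrative data with correlated columns, centered at zero.
X = rng.normal(size=(n, p)) @ np.array([[2.0, 0.3, 0.1],
                                        [0.0, 1.0, 0.2],
                                        [0.0, 0.0, 0.5]])
X = X - X.mean(axis=0)

# eigh handles symmetric matrices; eigenvalues come back in ascending order.
eigvals, eigvecs = np.linalg.eigh(X.T @ X)
phi1 = eigvecs[:, -1]        # eigenvector with the largest eigenvalue
Z1 = X @ phi1                # first principal component scores

# Var(Z1) = (1/n) phi1' X'X phi1 = lambda_1 / n; the two values match.
print(np.var(Z1), eigvals[-1] / n)
```

Because `phi1` is returned with unit norm, the constraint \(\Phi_1'\Phi_1 = 1\) holds automatically.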
The second principal component is the normalized linear combination of the features that has the largest variance among all combinations uncorrelated with \(Z_1\). That is, the second principal component solves \[\max_{\Phi_2} \Phi_2'\mathbb{X}'\mathbb{X}\Phi_2,\ \ \text{subject to} \ \Phi_2'\Phi_2 = 1, \ \Phi_2'\Phi_1 = 0.\]
Similar derivations as before show that the second principal component is the eigenvector of \(\mathbb{X}'\mathbb{X}\) with the second largest eigenvalue, \(\lambda_2\).
Moreover, the eigenvectors of \(\mathbb{X}'\mathbb{X}\) are orthogonal.
Theorem (Eigenvectors of symmetric matrices).
Let \(A\) be a symmetric matrix. Then, the eigenvectors of \(A\) associated with different eigenvalues are orthogonal.
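A quick numerical illustration of the theorem (the matrix is illustrative): for a symmetric matrix, the eigenvectors returned by an eigendecomposition are mutually orthogonal, so their Gram matrix is the identity.

```python
import numpy as np

# A symmetric matrix with distinct eigenvalues (chosen for illustration).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

eigvals, eigvecs = np.linalg.eigh(A)   # eigh is designed for symmetric matrices

# Pairwise inner products of the unit eigenvectors: the identity matrix.
gram = eigvecs.T @ eigvecs
print(np.allclose(gram, np.eye(3)))    # True
```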
The \(k\)-th principal component is the normalized linear combination of the features that has the \(k\)-th largest variance and is uncorrelated with the previous \(k-1\) principal components.
The \(k\)-th principal component is the eigenvector of \(\mathbb{X}'\mathbb{X}\) with the \(k\)-th largest eigenvalue.
The eigenvectors of \(\mathbb{X}'\mathbb{X}\) are orthogonal.
Since the variables are centered, \(\frac{1}{n}\mathbb{X}'\mathbb{X}\) is the sample covariance matrix of the data, so the principal component loadings are equivalently the eigenvectors of the covariance matrix.
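As a numerical check (data are illustrative), the eigenvectors of \(\mathbb{X}'\mathbb{X}\) and of the sample covariance matrix coincide, since the two matrices differ only by the scalar factor \(1/n\).

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
X = X - X.mean(axis=0)     # center each variable at zero
n = X.shape[0]

_, V_gram = np.linalg.eigh(X.T @ X)        # eigenvectors of X'X
_, V_cov = np.linalg.eigh((X.T @ X) / n)   # eigenvectors of the covariance matrix

# Same eigenvectors up to sign, so the loadings coincide.
print(np.allclose(np.abs(V_gram), np.abs(V_cov)))
```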
The scree plot shows the proportion of variance explained by each principal component.
It is used to determine the number of principal components needed to capture most of the information in the data.
The proportion of variance explained (PVE) by the \(k\)-th principal component is defined as \[\frac{\lambda_k}{\sum_{j=1}^p\lambda_j}.\]
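The quantities plotted in a scree plot can be computed directly from the eigenvalues of \(\mathbb{X}'\mathbb{X}\), as in this sketch (data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(150, 5))
X = X - X.mean(axis=0)     # center each variable at zero

# Eigenvalues of X'X, sorted from largest to smallest.
eigvals = np.linalg.eigvalsh(X.T @ X)[::-1]

# Proportion of variance explained by each component.
pve = eigvals / eigvals.sum()

print(pve)               # values the scree plot displays; they sum to one
print(np.cumsum(pve))    # cumulative PVE, used to choose the number of components
```

A common rule of thumb is to keep components up to the "elbow" of the scree plot, or enough components for the cumulative PVE to reach a chosen threshold.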