4 Wishart Random Matrices
4.1. Concentration for the largest eigenvalue of Wishart matrices
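- The largest singular value of the data matrix is a norm that satisfies a Lipschitz condition, so it concentrates around its expected value.
- We then still need to work out what this expected value actually is.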
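The two points above can be illustrated numerically. The following is a minimal sketch, assuming a `p x n` data matrix with independent standard Gaussian entries; the sizes, the trial count, the function name `largest_singular_values` and the comparison value `sqrt(n) + sqrt(p)` are choices made for this illustration and are not taken from the text.

```python
import numpy as np

def largest_singular_values(p, n, trials, seed=0):
    """Sample `trials` Gaussian p x n matrices and return each largest singular value."""
    rng = np.random.default_rng(seed)
    return np.array([
        np.linalg.norm(rng.standard_normal((p, n)), ord=2)  # ord=2 is the operator norm
        for _ in range(trials)
    ])

# Illustrative sizes (assumption, not from the notes).
p, n, trials = 50, 200, 200
s_max = largest_singular_values(p, n, trials)

# The operator norm is 1-Lipschitz with respect to the Euclidean norm of the
# entries, so its fluctuations stay of order 1 while its mean grows with p and n.
print("mean of s_max:    ", s_max.mean())
print("std of s_max:     ", s_max.std())
print("sqrt(n) + sqrt(p):", np.sqrt(n) + np.sqrt(p))
```

The spread of the samples should be small compared with their mean, which is the concentration effect; where the mean sits is the separate question addressed next.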
We want to understand the spectral properties of
Thus we can apply our Gaussian concentration inequality for Lipschitz functions. Note that
Suppose
Step 1 Simpler Question
Note that we can write
It's a bit easier to do this for balls than for spheres, so let us write
We want
Step 3
Set
Then we can choose an
Let
then there exist
Thus we have
Hence we need now the concentration inequality only for the finitely many summands on the right-hand side. Note that
thus
This then yields
Put
it is a rough estimate.
We will not pursue this any further, but instead we will now look at the collection of all singular values of
4.2. Eigenvalue distribution of Wishart matrices & Marchenko-Pastur law
Set
So first let us try to reduce this to a Lipschitz condition.
Let
Then one has, as for the maximal eigenvalue,
Thus
However, since
For this, the Lipschitz constant is modified as follows:
Note that the estimate
But let us take a closer look at this, as it also reveals the difference between the classical and modern regimes.
- In the classical regime,
which would give good concentration.
- But in the modern regime,
which does not give good concentration.
- So let's keep the operator norm
in Section 4.1; for this we already know that we have good concentration around and thus with high probability
By Theorem 3.2, this then gives concentration of around its expected value with
This means that in the modern regime the eigenvalue distribution of concentrates on its average. (The scaling factor in ensures that we have a limit for .) We then get the following:
Let
Then the histogram of the eigenvalues of
Remark.
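- The normalization constant here is not obtained by forcing normalization, as one does for the Gamma distribution; it emerges naturally from the self-consistent equation.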
- Note that the statement is of the form
where and𝟙 are the eigenvalues of .
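Proving (1) directly is not so clear, but it can be achieved by proving the analogous statement for another class of functions:
- (i) all indicator functions of intervals (look directly at the fraction of eigenvalues that fall in the interval),
- (ii) all moments (show that the eigenvalue moments of every order, i.e. the averages of the powers of the eigenvalues, match),
- (iii) all resolvents, with the spectral parameter ranging over the complex upper half plane (show that the averages of the resolvent match).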
By concentration, it suffices to prove in each case the version for the average, i.e. one has to prove
Note for this that and (if we restrict them to a compact interval) are Lipschitz functions.
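The statement can also be checked numerically before proving it. The following is a minimal sketch under assumptions that are not fixed by the text above: the matrix is taken as `A = X @ X.T / n` with `X` a `p x n` standard Gaussian matrix, the ratio is `lam = p / n <= 1`, and the Marchenko-Pastur density is used in its standard closed form for unit variance. The helper name `mp_density` and all sizes are illustrative.

```python
import numpy as np

def mp_density(x, lam):
    """Marchenko-Pastur density for ratio lam = p/n <= 1 and unit variance."""
    a, b = (1 - np.sqrt(lam)) ** 2, (1 + np.sqrt(lam)) ** 2
    inside = (x > a) & (x < b)
    rho = np.zeros_like(x, dtype=float)
    xi = x[inside]
    rho[inside] = np.sqrt((b - xi) * (xi - a)) / (2 * np.pi * lam * xi)
    return rho

rng = np.random.default_rng(0)
p, n = 500, 2000                  # illustrative sizes (assumption)
lam = p / n
X = rng.standard_normal((p, n))
A = X @ X.T / n                   # assumed scaling of the Wishart matrix
eigs = np.linalg.eigvalsh(A)

# Compare the normalized eigenvalue histogram with the density at the bin centers.
hist, edges = np.histogram(eigs, bins=30, range=(0.0, 2.5), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
max_gap = np.max(np.abs(hist - mp_density(centers, lam)))
print("largest gap between histogram and MP density:", max_gap)
```

A single sample already tracks the density closely, which is what the concentration argument predicts: it is enough to understand the averaged eigenvalue distribution.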
Proof. (by Self-consistent equation)
Step 1
- For Wishart matrices
- For the Marchenko-Pastur distribution
So what we have to prove is the convergence of the Stieltjes transforms:
Let
Let
Step 2
Using the resolvent, we get
By the Sherman-Morrison formula, we get
In fact,
So the left and right sides become:
Finally we can calculate
But this result does not yet come with a density, so we need a way to compute the density of the MP distribution:
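The Sherman-Morrison formula used in Step 2 is easy to sanity-check numerically. The sketch below assumes the symmetric rank-one form of the update, `M + x x^T`, which is the shape that arises when one column of the data matrix is split off; the matrix `M`, the vector `x` and the dimension `d` are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6                                   # illustrative dimension (assumption)
B = rng.standard_normal((d, d))
M = B @ B.T + d * np.eye(d)             # symmetric positive definite, hence invertible
x = rng.standard_normal(d)

M_inv = np.linalg.inv(M)
denom = 1.0 + x @ M_inv @ x             # > 1 here because M_inv is positive definite

# Sherman-Morrison: (M + x x^T)^{-1} = M^{-1} - M^{-1} x x^T M^{-1} / (1 + x^T M^{-1} x)
rhs = M_inv - np.outer(M_inv @ x, x @ M_inv) / denom
lhs = np.linalg.inv(M + np.outer(x, x))

print("max abs difference:", np.max(np.abs(lhs - rhs)))
```

The point of the formula in the proof is that the inverse after a rank-one change is expressed through the inverse before the change, which is what makes the self-consistent equation possible.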
Step 3
Let
Apply this to
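The passage from a Stieltjes transform back to a density can be sketched numerically. Several conventions below are assumptions, since the text above does not fix them: the Stieltjes transform is taken as `S(z) = ∫ 1/(t - z) dμ(t)` (with the opposite sign convention the imaginary part changes sign), the self-consistent equation is taken in the form `lam*z*S**2 + (z + lam - 1)*S + 1 = 0` for ratio `lam = p/n <= 1` and unit variance, and the function names are hypothetical.

```python
import numpy as np

def mp_stieltjes(z, lam):
    """Root of lam*z*S^2 + (z + lam - 1)*S + 1 = 0 with the larger imaginary part.

    For z in the upper half plane, a Stieltjes transform (convention 1/(t - z))
    also lies in the upper half plane, which selects the root.
    """
    roots = np.roots([lam * z, z + lam - 1.0, 1.0])
    return roots[np.argmax(roots.imag)]

def mp_density(x, lam):
    """Closed-form Marchenko-Pastur density, used here only as a reference."""
    a, b = (1 - np.sqrt(lam)) ** 2, (1 + np.sqrt(lam)) ** 2
    if not a < x < b:
        return 0.0
    return np.sqrt((b - x) * (x - a)) / (2 * np.pi * lam * x)

lam, eps = 0.25, 1e-6               # illustrative ratio and small imaginary part
xs = np.linspace(0.3, 2.2, 20)      # points inside the support [0.25, 2.25]

# Stieltjes inversion: density(x) is the limit of Im S(x + i*eps) / pi as eps -> 0.
recovered = np.array([mp_stieltjes(x + 1j * eps, lam).imag / np.pi for x in xs])
exact = np.array([mp_density(x, lam) for x in xs])
max_err = np.max(np.abs(recovered - exact))
print("max error of Stieltjes inversion:", max_err)
```

Choosing the root by its imaginary part avoids having to pick a branch of the complex square root by hand.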
V. A. Marchenko and L. A. Pastur, The distribution of eigenvalues in certain sets of random matrices, Math. USSR-Sbornik 1 (1967), 457–483.




