<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Math]]></title><description><![CDATA[Obsidian digital garden]]></description><link>http://github.com/dylang/node-rss</link><image><url>site-lib/media/favicon.jpg</url><title>Math</title><link></link></image><generator>Webpage HTML Export plugin for Obsidian</generator><lastBuildDate>Sat, 14 Mar 2026 06:39:28 GMT</lastBuildDate><atom:link href="site-lib/rss.xml" rel="self" type="application/rss+xml"/><pubDate>Sat, 14 Mar 2026 06:39:27 GMT</pubDate><ttl>60</ttl><dc:creator></dc:creator><item><title><![CDATA[4 Wishart Random Matrices]]></title><description><![CDATA[Definition. The Wishart matrix is $W = \frac{1}{n} X X^T$, where $X$ denotes the $p \times n$ data matrix ($n$ observations of a $p$-dimensional parameter). $W$ is an estimator for the covariance matrix.
<img src="attachments/4-wishart-random-matrices-1772624976366.webp" target="_self" style="width: 494px; max-width: 100%;">Remark. The largest singular value of the data matrix is a norm satisfying a Lipschitz condition, so it concentrates around its expected value. We then still have to figure out what this expected value actually is. We want to understand the spectral properties of $W$, i.e. the eigenvalues of $W$, or the singular values of $X$. Let us first restrict to the largest singular value $\|X\|$:
It is Lipschitz: for $X, Y \in \mathbb{R}^{p \times n}$, $\big|\,\|X\| - \|Y\|\,\big| \le \|X - Y\| \le \|X - Y\|_2$, where $\|\cdot\|_2$ denotes the Euclidean norm of the entries; so $X \mapsto \|X\|$ is 1-Lipschitz.
Thus we can apply our Gaussian concentration inequality for Lipschitz functions. Note that $X \in \mathbb{R}^{p \times n}$ corresponds to a standard Gaussian vector in $\mathbb{R}^{pn}$.Theorem. 4.1. (Standard data can always be centered.)
Suppose $X$ is a $p \times n$ standard Gaussian random matrix, i.e. the entries $x_{ij}$ are i.i.d. $N(0,1)$. Then $\|X\|$ concentrates around its expected value: for all $t > 0$, $P\big(\big|\,\|X\| - E[\|X\|]\,\big| \ge t\big) \le 2 e^{-t^2/2}$. Proof.<br>
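As a quick numerical illustration (an addition to these notes, not part of the original lecture), we can sample a few standard Gaussian matrices and check that the largest singular value lands close to $\sqrt{n} + \sqrt{p}$, the standard Gordon-type bound on its expectation:

```python
import numpy as np

# Empirically check that the largest singular value of a p x n standard
# Gaussian matrix concentrates near sqrt(n) + sqrt(p).
rng = np.random.default_rng(0)
n, p, trials = 400, 100, 20
norms = [np.linalg.norm(rng.standard_normal((p, n)), ord=2)  # ord=2: top singular value
         for _ in range(trials)]
mean_norm = float(np.mean(norms))
prediction = np.sqrt(n) + np.sqrt(p)   # = 30 here
print(mean_norm, prediction)
```

The sample-to-sample fluctuations are tiny compared to the mean, which is exactly the concentration phenomenon of Theorem 4.1.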
By <a data-tooltip-position="top" aria-label="3 Concentration of Gaussian random vectors for non-linear Lipschitz function &gt; ^fx0z71" data-href="3 Concentration of Gaussian random vectors for non-linear Lipschitz function#^fx0z71" href="随机矩阵与机器学习/3-concentration-of-gaussian-random-vectors-for-non-linear-lipschitz-function.html" class="internal-link" target="_self" rel="noopener nofollow">Theorem. 3.2 Gaussian concentration for Lipschitz functions</a>. Question: what is the central value? Step 1. A simpler question.
Note that we can write $\|X\| = \sup_{\|u\| = \|v\| = 1} \langle u, X v \rangle$.
But there are infinitely many terms. Too hard to control! So we reduce to finitely many conditions by approximating all $u$ and $v$ by elements from $\varepsilon$-nets.
It's a bit easier to do this for balls than for spheres, so let us write $\|X\| = \sup_{u \in B_p,\, v \in B_n} \langle u, X v \rangle$, where $B_k$ denotes the closed unit ball in $\mathbb{R}^k$,
and let now $N_\varepsilon$ be an $\varepsilon$-net for $B_k$,
i.e. such that for every $x \in B_k$ there is some $y \in N_\varepsilon$ with $\|x - y\| \le \varepsilon$.
Step 2
We want $|N_\varepsilon|$ to be as small as possible; it is easy to see that there exists an $\varepsilon$-net with $|N_\varepsilon| \le (1 + 2/\varepsilon)^k$.
Construct an $\varepsilon$-net greedily, by always choosing a new point $y_{m+1} \in B_k$ such that $\|y_{m+1} - y_i\| > \varepsilon$ for all $i \le m$.
Thus all the balls $y_i + \frac{\varepsilon}{2} B_k$ are disjoint and contained in $(1 + \frac{\varepsilon}{2}) B_k$, so comparing volumes gives $|N_\varepsilon| \cdot (\frac{\varepsilon}{2})^k \le (1 + \frac{\varepsilon}{2})^k$, i.e. $|N_\varepsilon| \le (1 + 2/\varepsilon)^k$.
Note that by this volume constraint the construction can only produce finitely many points. This guarantees that the resulting set is an $\varepsilon$-net: if there were a point $x \in B_k$ at distance greater than $\varepsilon$ from all chosen points, then this $x$ could have been chosen as a further point in our construction. Hence we have constructed an $\varepsilon$-net with the required cardinality.Step 3
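The greedy construction above is easy to carry out numerically; the following sketch (an added illustration, with variable names of my own choosing) builds an $\varepsilon$-separated set from random points on the sphere and checks it against the volume bound $(1 + 2/\varepsilon)^k$:

```python
import numpy as np

def greedy_net(points, eps):
    # Greedy construction: keep a point only if it is more than eps
    # away from every point kept so far.
    net = []
    for x in points:
        if all(np.linalg.norm(x - y) > eps for y in net):
            net.append(x)
    return net

rng = np.random.default_rng(1)
k, eps = 3, 0.5
pts = rng.standard_normal((5000, k))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)   # project onto the unit sphere
net = greedy_net(pts, eps)
volume_bound = (1 + 2 / eps) ** k                    # = 125 for eps = 1/2
print(len(net), volume_bound)
```

By construction every sampled point is within $\varepsilon$ of some net point, and the net size stays well below the volume bound.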
Fix some $0 < \varepsilon < 1/2$. Then we can choose an $\varepsilon$-net $N^{(p)}$ for $B_p$ and an $\varepsilon$-net $N^{(n)}$ for $B_n$ with $|N^{(p)}| \le (1 + 2/\varepsilon)^p$ and $|N^{(n)}| \le (1 + 2/\varepsilon)^n$. Let $u$ and $v$ be a maximizer for $\|X\| = \sup_{u,v} \langle u, X v \rangle$ (note that by finite dimension and compactness of the unit ball the supremum is indeed a maximum);
then there exist $u' \in N^{(p)}$ and $v' \in N^{(n)}$ such that $\|u - u'\| \le \varepsilon$ and $\|v - v'\| \le \varepsilon$.
Then $\langle u, Xv \rangle - \langle u', Xv' \rangle = \langle u - u', Xv \rangle + \langle u', X(v - v') \rangle \le 2\varepsilon \|X\|$. Thus we have $\|X\| \le \frac{1}{1 - 2\varepsilon} \max_{u' \in N^{(p)},\, v' \in N^{(n)}} \langle u', X v' \rangle$.
Hence we now need the concentration inequality only for the finitely many terms on the right-hand side. Note that $\langle u', Xv' \rangle = \sum_{i,j} u'_i x_{ij} v'_j$ is a centered Gaussian with variance $\|u'\|^2 \|v'\|^2 \le 1$; thus $P(\langle u', Xv' \rangle \ge t) \le e^{-t^2/2}$.
A union bound over the two nets (now with $t$ replaced by $(1 - 2\varepsilon)t$ to absorb the prefactor) gives $P(\|X\| \ge t) \le (1 + 2/\varepsilon)^{n+p}\, e^{-(1 - 2\varepsilon)^2 t^2 / 2}$. Putting $t = C\sqrt{n + p}$ for a sufficiently large constant $C$, this then yields $E[\|X\|] \le C(\sqrt{n} + \sqrt{p})$.<br>
<img src="attachments/4-wishart-random-matrices-1773058455477.webp" target="_self" style="width: 445px; max-width: 100%;">
This is a rough estimate. We will not pursue it any further; instead we now look at the collection of all singular values of $X$, or of all eigenvalues of $W$; i.e., we want to understand the asymptotics of the histograms of the eigenvalues. Let $x_1, \dots, x_n$ (the columns of $X$) be independent vectors with $E[x_i] = 0$ and $E[x_i x_i^T] = I_p$; then
What can we say about the eigenvalues of $W = \frac{1}{n} X X^T$? First let us try to reduce this to a Lipschitz condition. Let $\lambda_1(A) \ge \dots \ge \lambda_p(A)$ be the eigenvalues of a symmetric matrix $A$.<br>
<img src="attachments/4-wishart-random-matrices-1773059107308.webp" target="_self" style="width: 456px; max-width: 100%;">
Then one has, as for the maximal eigenvalue, $|\lambda_i(A) - \lambda_i(B)| \le \|A - B\| \le \|A - B\|_2$ for all $i$. Thus
i.e. the maps $A \mapsto \lambda_i(A)$ for $i = 1, \dots, p$ are Lipschitz and thus also the map
$A \mapsto (\lambda_1(A), \dots, \lambda_p(A))$ is Lipschitz. However, since $X$ is our matrix with independent Gaussian entries, we are interested in the mapping $X \mapsto$ (eigenvalues of $\frac{1}{n} X X^T$). For this, the Lipschitz constant is modified: it picks up a factor of order $\frac{1}{n}(\|X\| + \|Y\|)$.<br>Note that the estimate $\|X\| \le \|X\|_2$ is not helpful, since we know that $\|X\|_2 \approx \sqrt{np}$. <a data-tooltip-position="top" aria-label="2 Gaussian random vectors and linear concentration of Chebyshev and Bernstein type &gt; ^lrt756" data-href="2 Gaussian random vectors and linear concentration of Chebyshev and Bernstein type#^lrt756" href="随机矩阵与机器学习/2-gaussian-random-vectors-and-linear-concentration-of-chebyshev-and-bernstein-type.html" class="internal-link" target="_self" rel="noopener nofollow">(Reason)</a> So $\|X\|_2$ is not a useful bound. But let us have a closer look at this, as it also reveals the difference between the classical and the modern regime.
In the classical regime, $p$ fixed and $n \to \infty$, the resulting Lipschitz constant is of order $\sqrt{p/n} \to 0$,
which would give good concentration.
But in the modern regime, $p \approx \lambda n$, this bound is only of order $\sqrt{\lambda} = O(1)$,
which does not give good concentration. So let us keep the operator norm $\|X\|$ as in Section 4.1; for this we already know that we have good concentration around $E[\|X\|] \le C(\sqrt{n} + \sqrt{p})$, and thus with high probability $\|X\| \le C(\sqrt{n} + \sqrt{p})$.<br>
By <a data-tooltip-position="top" aria-label="3 Concentration of Gaussian random vectors for non-linear Lipschitz function &gt; ^fx0z71" data-href="3 Concentration of Gaussian random vectors for non-linear Lipschitz function#^fx0z71" href="随机矩阵与机器学习/3-concentration-of-gaussian-random-vectors-for-non-linear-lipschitz-function.html" class="internal-link" target="_self" rel="noopener nofollow">Theorem 3.2</a>, this then gives concentration of the averaged eigenvalue statistics of $W$ around their expected value.
This means that in the modern regime the eigenvalue distribution of $W = \frac{1}{n} X X^T$ concentrates around its average (the scaling factor $\frac{1}{n}$ in $W$ ensures that we have a limit for $n \to \infty$), so we get:
<br><img src="attachments/4-wishart-random-matrices-1772177853460.webp" target="_self" style="width: 583px; max-width: 100%;"><br>Theorem. 4.2 Marchenko-Pastur Law <a data-footref="mp" href="#fn-1-48672ee7560f97e2" class="footnote-link" target="_self" rel="noopener nofollow">[1]</a>
Let &amp; . If . (观察次数 数据维数)
Then the histogram of the eigenvalues of converges to the Marchenko-Pastur density
where &amp; .
Remark. The normalization constant here is not obtained by brute-force normalization, as for the Gamma distribution; it comes out naturally from the self-consistent equation.
Note that the statement is of the form $\frac{1}{p}\,\#\{i : \lambda_i \in [a, b]\} \to \int_a^b \rho_\lambda(x)\,dx$, (1)
where $\lambda_1, \dots, \lambda_p$ are the eigenvalues of $W$.
Proving (1) directly is not so easy, but it can be achieved by proving analogous statements for other classes of test functions. Instead of proving (1) for
(i) all indicator functions $1_{[a,b]}$ for all $a < b$ (looking directly at the fraction of eigenvalues falling in the interval $[a, b]$), one can prove the analogue for
(ii) all moments $\frac{1}{p}\sum_i \lambda_i^k$ for all $k \in \mathbb{N}$ (showing that all moments, i.e. the averages of the $\lambda_i^k$, match), or for
(iii) all resolvents $\frac{1}{p}\sum_i \frac{1}{\lambda_i - z}$ for all $z \in \mathbb{C}^+$. ($\mathbb{C}^+$ denotes the complex upper half plane; one shows that the averages of the resolvent match.)
By concentration, it suffices to prove in each case the version for the average, i.e. one has to prove $E\big[\frac{1}{p}\sum_i f(\lambda_i)\big] \to \int f(x)\,\rho_\lambda(x)\,dx$.
Note for this that $x \mapsto \frac{1}{x - z}$ and $x \mapsto x^k$ (if we restrict them to a compact interval) are Lipschitz functions.
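Route (ii) can be illustrated numerically (an added aside; the first two moments of the Marchenko-Pastur law are the standard values $m_1 = 1$ and $m_2 = 1 + \lambda$):

```python
import numpy as np

# First two empirical moments of the eigenvalue distribution of
# W = X X^T / n; for the Marchenko-Pastur law m1 = 1 and m2 = 1 + lam.
rng = np.random.default_rng(5)
p, n = 200, 400
lam = p / n
X = rng.standard_normal((p, n))
W = X @ X.T / n
m1 = np.trace(W) / p        # (1/p) sum lambda_i
m2 = np.trace(W @ W) / p    # (1/p) sum lambda_i^2
print(m1, m2)
```

Already at $p = 200$ the empirical moments agree with the limiting values to a few percent, reflecting the concentration discussed above.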
Proof (via a self-consistent equation).
Step 1. Definition. 4.4 Stieltjes Transform. For a probability measure $\mu$ on $\mathbb{R}$ and $z \in \mathbb{C}^+$, set $S_\mu(z) = \int \frac{1}{x - z}\,d\mu(x)$. For Wishart matrices this is $S_p(z) = \frac{1}{p}\sum_i \frac{1}{\lambda_i - z}$; for the Marchenko-Pastur distribution it is $S_{MP}(z) = \int \frac{\rho_\lambda(x)}{x - z}\,dx$. So what we have to prove is the convergence of the Stieltjes transforms: $E[S_p(z)] \to S_{MP}(z)$ for all $z \in \mathbb{C}^+$.
We first make some preparations for this. Lemma. 4.3 Let $A$ be a symmetric $N \times N$ matrix with eigenvalues $\lambda_1, \dots, \lambda_N$. Then, for any $z \in \mathbb{C}^+$, we have $\frac{1}{N}\sum_{i=1}^N \frac{1}{\lambda_i - z} = \operatorname{tr}\big[(A - z)^{-1}\big]$,
where $\operatorname{tr} = \frac{1}{N}\operatorname{Tr}$ is the normalized trace on $N \times N$ matrices.
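Lemma 4.3 is easy to check numerically; the following added snippet compares the normalized trace of the resolvent with the average of $1/(\lambda_i - z)$:

```python
import numpy as np

# Lemma 4.3: the normalized trace of the resolvent (A - z)^{-1} of a
# symmetric matrix A equals the average of 1 / (lambda_i - z).
rng = np.random.default_rng(3)
N = 6
B = rng.standard_normal((N, N))
A = (B + B.T) / 2                                   # symmetrize
z = 0.5 + 1.0j                                      # z in the upper half plane
lhs = np.trace(np.linalg.inv(A - z * np.eye(N))) / N
rhs = np.mean(1.0 / (np.linalg.eigvalsh(A) - z))
print(lhs, rhs)
```

The two sides agree to machine precision, since the trace is invariant under the diagonalization of $A$.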
<br>By <a data-tooltip-position="top" aria-label="Char2 Wigner Ensemble &amp; SemiCircle Law &gt; ^v7zq7x" data-href="Char2 Wigner Ensemble &amp; SemiCircle Law#^v7zq7x" href="随机矩阵第一节课/char2-wigner-ensemble-&amp;-semicircle-law.html" class="internal-link" target="_self" rel="noopener nofollow">The eigenvalues of Resolvent</a>Lemma. 4.6 (a very important trick)
Let $g$ be a standard Gaussian vector in $\mathbb{R}^k$ and $A$ a deterministic or random $k \times k$ matrix independent from $g$. Then $E[\langle g, A g \rangle] = E[\operatorname{Tr}(A)]$. Proof.
Write $A = (a_{ij})_{i,j=1}^k$. We have $E[\langle g, A g \rangle] = \sum_{i,j} E[a_{ij}]\,E[g_i g_j] = \sum_i E[a_{ii}] = E[\operatorname{Tr}(A)]$, since $E[g_i g_j] = \delta_{ij}$. Step 2<br>
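Lemma 4.6 can also be verified by Monte Carlo (an added illustration; here $A$ is a fixed deterministic matrix for simplicity):

```python
import numpy as np

# Lemma 4.6 by Monte Carlo: E[g^T A g] = Tr(A) for a standard Gaussian g
# independent of A.
rng = np.random.default_rng(4)
k = 5
A = rng.standard_normal((k, k))
g = rng.standard_normal((200_000, k))          # 200k independent Gaussian samples
quad = np.einsum('ni,ij,nj->n', g, A, g)       # g^T A g for each sample
print(quad.mean(), np.trace(A))
```

The sample mean of the quadratic forms matches the trace of $A$ up to Monte Carlo error, exactly as the lemma predicts.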
By <a data-tooltip-position="top" aria-label="Char2 Wigner Ensemble &amp; SemiCircle Law &gt; ^yxh4qp" data-href="Char2 Wigner Ensemble &amp; SemiCircle Law#^yxh4qp" href="随机矩阵第一节课/char2-wigner-ensemble-&amp;-semicircle-law.html" class="internal-link" target="_self" rel="noopener nofollow">Resolvent</a> we can get an equation for $E[S_p(z)]$.<br>By the <a data-tooltip-position="top" aria-label="Char1 矩阵基础知识 &gt; ^rzmtuw" data-href="Char1 矩阵基础知识#^rzmtuw" href="随机矩阵第一节课/char1-矩阵基础知识.html" class="internal-link" target="_self" rel="noopener nofollow">Sherman-Morrison Formula</a> we can express the effect of removing one column. In fact, the removed column can be replaced by any other one, since the columns are exchangeable. Thus we obtain a self-consistent equation for $E[S_p(z)]$; comparing the left and right sides and solving the resulting equation, we can finally compute $S_{MP}(z)$. But this comes out as a Stieltjes transform, without an explicit density; so we need a way to extract the density of the MP distribution from it: Step 3. Lemma 4.7 (Stieltjes Inversion Formula). Let $\rho$ be a continuous probability density on $\mathbb{R}$. Then its Stieltjes transform
has a continuous extension to $\mathbb{C}^+ \cup \mathbb{R}$, and $\rho(x) = \lim_{\varepsilon \searrow 0} \frac{1}{\pi} \operatorname{Im} S_\rho(x + i\varepsilon)$. Proof.
For all $x \in \mathbb{R}$ and $\varepsilon > 0$, we have $\operatorname{Im} S_\rho(x + i\varepsilon) = \int \frac{\varepsilon}{(t - x)^2 + \varepsilon^2}\,\rho(t)\,dt$,
and thus, since the Poisson kernel $\frac{1}{\pi}\frac{\varepsilon}{(t - x)^2 + \varepsilon^2}$ is an approximate identity, $\frac{1}{\pi}\operatorname{Im} S_\rho(x + i\varepsilon) \to \rho(x)$ as $\varepsilon \searrow 0$. Applying this to the Stieltjes transform computed in Step 2,
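The inversion formula can be illustrated numerically (an added sketch; the grid size and the value of $\varepsilon$ are my own choices): starting from the MP density, compute its Stieltjes transform by a Riemann sum and recover the density at an interior point:

```python
import numpy as np

# Stieltjes inversion: recover the MP density at x0 via
# rho(x0) = lim_{eps -> 0} Im S(x0 + i*eps) / pi,
# where S(z) = int rho(x) / (x - z) dx.
lam = 0.5
lo = (1 - np.sqrt(lam)) ** 2
hi = (1 + np.sqrt(lam)) ** 2
xs = np.linspace(lo, hi, 200_001)[1:-1]        # drop the endpoints
dx = xs[1] - xs[0]
rho = np.sqrt((hi - xs) * (xs - lo)) / (2 * np.pi * lam * xs)

def stieltjes(z):
    # Riemann-sum approximation of S(z) for z off the real axis
    return np.sum(rho / (xs - z)) * dx

x0, eps = 1.0, 1e-3
recovered = stieltjes(x0 + 1j * eps).imag / np.pi
exact = np.sqrt((hi - x0) * (x0 - lo)) / (2 * np.pi * lam * x0)
print(recovered, exact)
```

With $\varepsilon$ much larger than the grid spacing but still small, the recovered value matches the closed-form MP density to high accuracy.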
then we obtain the form of the Marchenko-Pastur density as claimed in Theorem 4.2.<br><img src="attachments/4-wishart-random-matrices-1773209233653.webp" target="_self" style="width: 379px; max-width: 100%;"> <br>V.A. Marchenko and L.A. Pastur, The distribution of eigenvalues in certain sets of random matrices, Math. USSR-Sbornik 1 (1967), 457-483.<a href="#fnref-1-48672ee7560f97e2" class="footnote-backref footnote-link" target="_self" rel="noopener nofollow">↩︎</a>
]]></description><link>4-wishart-random-matrices.html</link><guid isPermaLink="false">随机矩阵与机器学习/4 Wishart Random Matrices.md</guid><pubDate>Fri, 13 Mar 2026 14:38:17 GMT</pubDate><enclosure url="." length="0" type="false"/><content:encoded>&lt;figure&gt;&lt;img src=&quot;.&quot;&gt;&lt;/figure&gt;</content:encoded></item></channel></rss>