High-dimensional Asymptotics of Feature Learning After One Step of Gradient Descent

Speaker: 

Zhichao Wang

Institution: 

UCSD

Time: 

Wednesday, March 30, 2022 - 2:00pm to 3:00pm

Location: 

510R Rowland Hall

In this talk, I will discuss the spectral properties of a two-layer neural network after one gradient step and their applications in random ridge regression. We consider the first gradient step in a two-layer randomly initialized neural network with the empirical MSE loss. In the proportional asymptotic limit, where all the dimensions go to infinity at the same rate, and an idealized student-teacher setting, we will show that the first gradient update contains a rank-1 ''spike'', which results in an alignment between the first-layer weights and the linear component of the teacher model. By verifying a Gaussian equivalent property, we can compute the prediction risk of ridge regression on the conjugate kernel after one gradient step. We will present two scalings of the first step learning rate. For a small learning rate, we compute the asymptotic risks for the ridge regression estimator on top of trained features which does not outperform the best linear model. Whereas for a sufficiently large learning rate, we prove that the ridge estimator on the trained features can go beyond this ``linear'' regime. Our analysis demonstrates that even one gradient step can lead to an advantage over the initial features. Our theoretical results are mainly based on random matrix theory and operator-valued free probability theory, which will be summarized in this talk. This is recent joint work with Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Denny Wu, and Greg Yang.

Adjusted chi-square test for degree-corrected block models

Speaker: 

Arash A. Amini

Institution: 

UCLA

Time: 

Wednesday, May 4, 2022 - 2:00pm to 3:00pm

Host: 

Location: 

510R Rowland Hall

We propose a goodness-of-fit test for degree-corrected stochastic block models (DCSBM). The test is based on an adjusted chi-square statistic for measuring equality of means among groups of $n$ multinomial distributions with $d_1,\dots,d_n$ observations. In the context of network models, the number of multinomials, $n$, grows much faster than the number of observations, $d_i$, corresponding to the degree of node $i$, hence the setting deviates from classical asymptotics. We show that a simple adjustment allows the statistic to converge in distribution, under null, as long as the harmonic mean of $\{d_i\}$ grows to infinity. 

When applied sequentially, the test can also be used to determine the number of communities. The test operates on a (row) compressed version of the adjacency matrix, conditional on the degrees, and as a result is highly scalable to large sparse networks. We incorporate a novel idea of compressing the rows based on a $(K+1)$-community assignment when testing for $K$ communities. This approach increases the power in sequential applications without sacrificing computational efficiency, and we prove its consistency in recovering the number of communities. Since the test statistic does not rely on a specific alternative, its utility goes beyond sequential testing and can be used to simultaneously test against a wide range of alternatives outside the DCSBM family. 

The test can also be easily applied to Poisson count arrays in clustering or biclustering applications, as well as bipartite and directed networks. We show the effectiveness of the approach by extensive numerical experiments with simulated and real data. In particular, applying the test to the Facebook-100 dataset, a collection of one hundred social networks, we find that a DCSBM with a small number of communities (say $ < 25$) is far from a good fit in almost all cases. Despite the lack of fit, we show that the statistic itself can be used as an effective tool for exploring community structure, allowing us to construct a community profile for each network.

https://arxiv.org/abs/2012.15047

Mathematics of synthetic data. III. Superregular random walks and private measures.

Speaker: 

Roman Vershynin

Institution: 

UCI

Time: 

Wednesday, June 1, 2022 - 2:00pm to 3:00pm

Location: 

510R Rowland Hall

In this last talk of the series, we construct a superregular random walk. This will be done by modifying a standard construction of the Brownian motion. Then we will use it to create private synthetic data on the interval. Using sspace-filling curves will allow to extend the construction to higher dimensions. Joint work with March Boedihardjo and Thomas Strohmer, https://arxiv.org/abs/2204.09167

Mathematics of synthetic data. II. Random walks.

Speaker: 

Roman Vershynin

Institution: 

UCI

Time: 

Wednesday, May 25, 2022 - 2:00pm to 3:00pm

Location: 

510R Rowland Hall

In this talk we will reduce the problem of constructing private synthetic data to a construction of a superregular random walk. Such walk locally looks like a simple random walk, but which globally deviates from the origin much slower than the Brownian motion. Joint work with March Boedihardjo and Thomas Strohmer, https://arxiv.org/abs/2204.09167

Mathematics of synthetic data. I. Differential privacy.

Speaker: 

Roman Vershynin

Institution: 

UCI

Time: 

Wednesday, May 18, 2022 - 2:00pm to 3:00pm

Location: 

510R Rowland Hall

This is a series of talks on synthetic data and its privacy. It is meant for beginners. In the first talk we will introduce the notion of differential privacy, construct a private mean estimator, and try (unsuccessfully) to construct a private measure. 

Quantum Unique Ergodicity for generalized Wigner matrices

Speaker: 

Lucas Benigni

Institution: 

University of Chicago

Time: 

Wednesday, May 11, 2022 - 2:00pm to 3:00pm

Location: 

510R Rowland Hall

We prove a strong form of delocalization of eigenvectors for general large random matrices called Quantum Unique Ergodicity. These estimates state that the mass of an eigenvector over a subset of entries tends to the uniform distribution. We are also able to prove that the fluctuations around the uniform distribution are Gaussian for a regime of a subset of entries. The proof relies on new eigenvector observables studied dynamically through the Dyson Brownian motion combined with a novel bootstrap comparison argument.

On some aspects of wave propagation in random media

Speaker: 

Knut Solna

Institution: 

UCI

Time: 

Wednesday, April 20, 2022 - 2:00pm to 3:00pm

Location: 

510R Rowland Hall

I will discuss some aspects of and approaches to modeling waves in random media.  When waves travel through complex media like the turbulent atmosphere, they are affected by the fluctuations in the medium.  The medium may be too complex to describe point-wise and it is convenient to model in terms of a random medium. One then wants to analyze how the wave field is affected by the medium fluctuations and I will discuss this challenge.    

Counting Hamiltonian \ell-Cycles in Dense Hypergraphs

Speaker: 

Liam Hardiman

Institution: 

UCI

Time: 

Wednesday, February 23, 2022 - 2:00pm to 3:00pm

Location: 

340 N Rowland Hall

A classical theorem of Dirac states that any graph on n vertices with minimum degree at least n/2 contains a Hamiltonian cycle, an edge-traversing cycle that visits each vertex in the graph exactly once. Since Dirac's theorem, researchers have found lower bounds on the number of Hamiltonian cycles a graph contains as a function of its minimum degree. These bounds aren't too far away from what one would expect to see in a typical random graph with appropriate edge probability. Here we contribute to the line of research that seeks to extend these results to k-uniform hypergraphs, where cycles are now chains of edges whose consecutive links overlap with one another in some fixed number, \ell, of vertices. We show that the number of so-called \ell-cycles in hypergraphs with sufficiently high co-degree is at least, up to a sub-exponential factor, what one would expect to see in a typical random hypergraph with appropriate (hyper)edge probability.

Joint work with Asaf Ferber and Adva Mond (University of Cambridge).

Universality and matrix concentration inequalities

Speaker: 

Tatiana Brailovskaya

Institution: 

Princeton University

Time: 

Wednesday, April 27, 2022 - 2:00pm to 3:00pm

Location: 

510R Rowland Hall

Random matrices frequently appear in many different fields — physics,
computer science, applied and pure mathematics. Oftentimes the random
matrix of interest will have non-trivial structure — entries that are
dependent and have potentially different means and variances (e.g.
sparse Wigner matrices, matrices corresponding to adjacencies of random
graphs, sample covariance matrices). However, current understanding of
such complicated random matrices remains lacking. In this talk, I will
discuss recent results concerning the spectrum of sums of independent
random matrices with a.s. bounded operator norms. In particular, I will
demonstrate that under some fairly general conditions, such sums will
exhibit the following universality phenomenon — their spectrum will
lie close to that of a Gaussian random matrix with the same mean and
covariance. No special background in random matrix theory will be
necessary for the audience — basic knowledge of probability and linear
algebra are sufficient.

Joint work with Ramon van Handel https://web.math.princeton.edu/~rvan/tuniv220113.pdf

Matrix Concentration: Combinatorial Proof

Speaker: 

March Boedihardjo

Institution: 

UCI

Time: 

Wednesday, February 9, 2022 - 2:00pm to 3:00pm

Location: 

510R Rowland Hall

This is a continuation of my talk last quarter on Sharp Matrix Concentration. In this talk, I will give a proof for a special case of the main result. The proof consists of two parts. The linear algebra part is on a quantitative estimate about noncommutativity. The other part involves combinatorics. If time permits, I will briefly talk about the analytic proof of the main result that works in general. Paper: https://arxiv.org/abs/2104.02662

Pages

Subscribe to RSS - Combinatorics and Probability