Colorings of the Middle Layers of the Hamming Cube

Speaker: 

Gwen McKinley

Institution: 

UCSD

Time: 

Wednesday, December 7, 2022 - 2:00pm to 3:00pm

Location: 

510R Rowland Hall

There is a rich and growing body of literature dedicated to studying typical colorings, independent sets, and more general homomorphisms in regular bipartite graphs. Much of this literature has been devoted to the Hamming cube and the discrete torus, where very strong structural and enumerative results are known. However, a number of the techniques that have been used rely heavily on the specific structure of these graphs. Here, we consider the middle two layers of the Hamming cube, which have slightly less "nice structure" than the entire Hamming cube, and ask for the typical structure of a q-coloring (where q is any constant). When q is even, we prove structural and enumerative results analogous to those known for the Hamming cube. In this talk, I will discuss some of our techniques, as well as future directions for generalizing this work to other graphs. This project is joint with Lina Li and Jinyoung Park.
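The bipartite regularity underlying this setting is easy to verify at toy scale. The sketch below (illustrative, with the smallest interesting size n = 5; not part of the talk) builds the middle two layers of the Hamming cube, i.e. the subsets of sizes 2 and 3 joined by containment, and checks that the resulting bipartite graph is 3-regular.

```python
from itertools import combinations
from collections import Counter

# Middle two layers of the Hamming cube {0,1}^5: subsets of {0,...,4}
# of sizes 2 and 3, joined when the smaller is contained in the larger.
n, k = 5, 2
lower = [frozenset(s) for s in combinations(range(n), k)]
upper = [frozenset(s) for s in combinations(range(n), k + 1)]
edges = [(a, b) for a in lower for b in upper if a < b]  # a < b: proper subset

deg = Counter()
for a, b in edges:
    deg[a] += 1
    deg[b] += 1

print(set(deg.values()), len(edges))  # every vertex has degree 3; 30 edges
```

Each 2-set lies in exactly n - k = 3 supersets and each 3-set contains exactly 3 subsets of size 2, so the graph is 3-regular with 10 · 3 = 30 edges.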

Overcoming the curse of dimensionality for solving high-dimensional Hamilton-Jacobi partial differential equations or optimal control problems using neural networks

Speaker: 

Tingwei Meng

Institution: 

UCLA

Time: 

Wednesday, October 5, 2022 - 2:00pm

Location: 

510R Rowland Hall

Hamilton-Jacobi PDEs and optimal control problems are widely used in many practical problems in control engineering, physics, financial mathematics, and machine learning. For instance, controlling an autonomous system is important in everyday modern life, and it requires a scalable, robust, efficient, and data-driven algorithm for solving optimal control problems. Traditional grid-based numerical methods cannot solve these high-dimensional problems, because they are not scalable and may suffer from the curse of dimensionality. To overcome the curse of dimensionality, we developed several neural network methods for solving high-dimensional Hamilton-Jacobi PDEs and optimal control problems. This talk consists of two parts. In the first part, I will talk about the SympOCNet method for multi-agent path planning, which solves a 512-dimensional path planning problem with a training time of less than 1.5 hours. In the second part, I will show several neural network architectures with solid theoretical guarantees for solving certain classes of high-dimensional Hamilton-Jacobi PDEs. By leveraging dedicated efficient hardware designed for neural networks, these methods have the potential for real-time applications in the future. These are joint works with Jerome Darbon, George Em Karniadakis, Peter M. Dower, Gabriel P. Langlois, and Zhen Zhang.
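To illustrate the grid-free flavor of such approaches (a toy sketch in this spirit, not the speaker's neural-network method), the Hopf-Lax formula evaluates the viscosity solution of $u_t + \frac{1}{2}\|\nabla u\|^2 = 0$ at a single point by solving a finite-dimensional minimization, $u(x,t) = \min_y [\, g(y) + \|x-y\|^2/(2t)\,]$, so the cost does not grow with a spatial mesh. Below, the initial condition $g(y) = \|y\|_1$ and the evaluation point are assumptions for illustration.

```python
import numpy as np

def g(y):
    # assumed initial condition g(y) = ||y||_1
    return np.sum(np.abs(y))

def hopf_lax(x, t, iters=500, lr=0.1):
    # evaluate u(x, t) = min_y [ g(y) + ||x - y||^2 / (2t) ]
    # by subgradient descent on y; no spatial grid is ever built
    y = x.copy()
    for _ in range(iters):
        grad = np.sign(y) + (y - x) / t   # subgradient of the objective
        y -= lr * grad
    return g(y) + np.sum((x - y) ** 2) / (2 * t)

x = np.full(10, 2.0)                      # a point in dimension 10
val = hopf_lax(x, t=1.0)
print(val)                                # ~15.0: each coordinate contributes |x_i| - t/2
```

For this separable example the exact value is known coordinate-wise (the Moreau envelope of $|\cdot|$), which makes it a convenient correctness check; the same pointwise-evaluation idea scales to much higher dimensions.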

Strong freezing of the binary perceptron model

Speaker: 

Shuangping Li

Institution: 

Stanford University

Time: 

Wednesday, October 26, 2022 - 2:00pm to 3:00pm

Location: 

510R Rowland Hall

We consider the binary perceptron model, a simple model of neural networks that has gathered significant attention in the statistical physics, information theory, and probability theory communities. We show that at low constraint density ($m = n^{1-\epsilon}$), the model exhibits a strong freezing phenomenon with high probability, i.e., most solutions are isolated. We prove this via a refined analysis of the log partition function. Our proof technique relies on a second moment method and cluster expansions. This is based on joint work with Allan Sly.
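The notion of isolation can be eyeballed at toy scale. The sketch below (a half-space variant of the perceptron with illustrative sizes, far from the talk's asymptotic regime) enumerates the hypercube, finds the solutions, and measures what fraction are isolated, meaning no Hamming-distance-1 neighbor is also a solution.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, m = 10, 5                        # toy sizes; the talk's regime is m = n^(1-eps)
G = rng.standard_normal((m, n))     # Gaussian constraint vectors g_1, ..., g_m

def is_solution(x):
    # half-space perceptron constraints: <g_a, x> >= 0 for every a
    return bool(np.all(G @ x >= 0))

# brute-force enumeration of {-1, +1}^n
solutions = [np.array(x) for x in itertools.product([-1, 1], repeat=n)
             if is_solution(np.array(x))]

def isolated(x):
    # x is isolated if flipping any single coordinate breaks some constraint
    for i in range(n):
        y = x.copy()
        y[i] = -y[i]
        if is_solution(y):
            return False
    return True

frac = sum(isolated(x) for x in solutions) / max(len(solutions), 1)
print(len(solutions), round(frac, 2))
```

At these tiny sizes solutions typically still have solution neighbors; the freezing statement is about the high-dimensional limit, where with high probability the isolated fraction tends to one.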

Detecting Hidden Communities by Power Iterations with Connections to Vanilla Spectral Algorithms

Speaker: 

Jiapeng Zhang

Institution: 

USC

Time: 

Wednesday, November 30, 2022 - 2:00pm to 3:00pm

Location: 

510R Rowland Hall

Community detection in the stochastic block model is one of the central problems of graph clustering. In this setup, spectral algorithms have been one of the most widely used frameworks for the design of clustering algorithms. However, despite the long history of study, there are still unsolved challenges. One of the main open problems is the design and analysis of "simple" spectral algorithms, especially when the number of communities is large.

In this talk, I will discuss two algorithms. The first is based on the power-iteration method. In the dense graph regime, it matches the best known bounds, due to Van Vu (Combinatorics, Probability and Computing, 2018), up to logarithmic factors.

Then, based on a connection between the powered adjacency matrix and its eigenvectors, we provide a "vanilla" spectral algorithm for a large number of communities in the balanced case. Our spectral algorithm is as simple as PCA (principal component analysis).

This talk is based on joint work with Chandra Sekhar Mukherjee (https://arxiv.org/abs/2211.03939).
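As a minimal illustration of the power-iteration idea (a generic sketch with assumed parameters, not the algorithm from the paper), one can recover two balanced communities in a dense stochastic block model by iterating a centered adjacency matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q = 200, 0.6, 0.1            # illustrative dense regime
z = np.repeat([1, -1], n // 2)     # ground-truth balanced communities

# sample a symmetric SBM adjacency matrix (no self-loops)
P = np.where(z[:, None] == z[None, :], p, q)
U = rng.random((n, n))
A = (np.triu(U, 1) < np.triu(P, 1)).astype(float)
A = A + A.T

B = A - A.mean()                   # crude centering to suppress the all-ones direction
v = rng.standard_normal(n)
for _ in range(100):               # power iteration on the centered matrix
    v = B @ v
    v /= np.linalg.norm(v)

labels = np.sign(v)
agree = np.mean(labels == z)
accuracy = max(agree, 1 - agree)   # community labels are defined up to a sign flip
print(accuracy)
```

The dominant eigenvector of the centered matrix aligns with the community signs, so thresholding it recovers the partition; the talk's concern is making such "simple" procedures work when the number of communities is large.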

Estimation of the covariance matrix in the presence of outliers

Speaker: 

Nikita Zhivotovskiy

Institution: 

UC Berkeley

Time: 

Wednesday, November 16, 2022 - 2:00pm to 3:00pm

Location: 

510R Rowland Hall

Suppose we are observing a sample of independent random vectors, knowing that the original distribution was contaminated, so that a fraction of observations came from a different distribution. How can we estimate the covariance matrix of the original distribution in this case? In this talk, we discuss an estimator of the covariance matrix that achieves the optimal dimension-free rate of convergence under two standard notions of data contamination: we allow the adversary to corrupt a fraction of the sample arbitrarily, while the distribution of the remaining data points only satisfies a certain (rather weak) moment equivalence assumption. Despite requiring the existence of only a few moments, our estimator achieves the same tail estimates as if the underlying distribution were Gaussian. Based on joint work with Pedro Abdalla.
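For intuition only, here is a crude robustness baseline (an elementwise median-of-means over sample blocks with assumed parameters, not the estimator from the talk), showing how the plain empirical covariance breaks under contamination while a median-based aggregate degrades gracefully:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 2000, 5
X = rng.standard_normal((n, d))          # clean data, true covariance = I_d, mean 0

# adversarial contamination: replace a 2% fraction by large outliers
eps = 0.02
bad = rng.choice(n, int(eps * n), replace=False)
X[bad] = 20.0                            # every coordinate pushed to 20

naive = X.T @ X / n                      # uncentered empirical covariance (true mean is 0)

# simple robust baseline: elementwise median of block covariance estimates,
# with more blocks than twice the number of outliers
k = 200
blocks = np.array_split(rng.permutation(n), k)
block_covs = np.stack([X[b].T @ X[b] / len(b) for b in blocks])
mom = np.median(block_covs, axis=0)

err_naive = np.linalg.norm(naive - np.eye(d))   # Frobenius errors vs the truth
err_mom = np.linalg.norm(mom - np.eye(d))
print(err_naive, err_mom)
```

A handful of extreme points dominates the mean-based estimate, but fewer than half the blocks are contaminated, so the elementwise median stays near the identity. Achieving the *optimal dimension-free* rate, as in the talk, requires a much more careful construction.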

Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks

Speaker: 

Ludovic Stephan

Institution: 

EPFL

Time: 

Wednesday, November 9, 2022 - 2:00pm to 3:00pm

Location: 

510R Rowland Hall

Despite the non-convex optimization landscape, over-parametrized shallow networks are able to achieve global convergence under gradient descent. The picture can be radically different for narrow networks, which tend to get stuck in badly generalizing local minima. Here we investigate the cross-over between these two regimes in the high-dimensional setting, and in particular, investigate the connection between the so-called mean field/hydrodynamic regime and the seminal approach of Saad & Solla. Focusing on the case of Gaussian data, we study the interplay between the learning rate, the time scale, and the number of hidden units in the high-dimensional dynamics of stochastic gradient descent (SGD). Our work builds on a deterministic description of SGD in high dimensions from statistical physics, which we extend and for which we provide rigorous convergence rates. https://arxiv.org/abs/2202.00293
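The online-SGD setting can be sketched in its simplest instance: a single-neuron student learning a matched tanh teacher from fresh Gaussian samples, tracking the overlap between student and teacher directions (the Saad & Solla-style order parameter). Dimension, learning rate, and step count below are illustrative assumptions, not the regimes analyzed in the talk.

```python
import numpy as np

rng = np.random.default_rng(5)
d, steps, lr = 50, 20000, 0.05
beta = np.zeros(d); beta[0] = 1.0            # teacher direction, unit norm
w = rng.standard_normal(d) / np.sqrt(d)      # student init: overlap ~ 1/sqrt(d)

def overlap(w):
    # cosine of the angle between student and teacher (sign-invariant)
    return abs(w @ beta) / np.linalg.norm(w)

overlap_start = overlap(w)
for _ in range(steps):                        # one fresh Gaussian sample per step
    x = rng.standard_normal(d)
    y = np.tanh(beta @ x)                     # single-neuron teacher
    f = np.tanh(w @ x)                        # single-neuron student
    w -= lr * (f - y) * (1 - f ** 2) * x      # online SGD on the squared loss
overlap_final = overlap(w)
print(overlap_start, overlap_final)
```

The overlap drifts from its initial $O(1/\sqrt{d})$ value toward 1; the high-dimensional analyses referenced above describe exactly such order-parameter dynamics, deterministically, across learning-rate and time-scale regimes.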

Large deviations for projections of high-dimensional measures

Speaker: 

Yin-Ting Liao

Institution: 

UC Irvine

Time: 

Wednesday, September 28, 2022 - 2:00pm to 3:00pm

Location: 

510R Rowland Hall

Random projections of high-dimensional probability measures have gained much attention in asymptotic convex geometry and high-dimensional statistics. While fluctuations at the level of the central limit theorem have been classically studied, only more recently has an inquiry into large deviation principles for such projections been initiated. In this talk, I will review existing work and describe our results on large deviations. I will also talk about sharp large deviation estimates to obtain the prefactor apart from the exponential decay, in the spirit of Bahadur and Ranga Rao. Applications to asymptotic convex geometry and a range of examples, including $\ell^p$ balls and Orlicz balls, will be given. This talk is based on several joint works with S. S. Kim and K. Ramanan.

Optimal minimization of the covariance loss

Speaker: 

Vishesh Jain

Institution: 

Stanford University

Time: 

Thursday, June 2, 2022 - 2:00pm

Location: 

510R Rowland Hall

Let $X$ be a random vector valued in $\mathbb{R}^m$ such that $\|X\|_{2} \leq 1$ almost surely. In this talk, I will discuss two proofs -- one based on the pinning lemma from statistical physics and another based on randomized rounding -- showing that for every $k \geq 3$, there exists a sigma algebra $\mathcal{F}$ generated by a partition of $\mathbb{R}^{m}$ into $k$ sets such that
    \[\|\operatorname{Cov}(X) - \operatorname{Cov}(\mathbb{E}[X\mid\mathcal{F}])
    \|_{\mathrm{F}} \lesssim \frac{1}{\sqrt{\log{k}}}.\]
This estimate is optimal up to the implicit constant, and improves a previous result of Boedihardjo, Strohmer, and Vershynin, obtained in connection with the design of accurate, privacy-preserving synthetic data, by a factor of $\sqrt{\log\log{k}}$. Joint work with Ashwin Sah (MIT) and Mehtaab Sawhney (MIT).
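A quick numerical illustration of the quantity being minimized (using a naive sign-pattern partition with assumed sizes, not the sigma algebra constructed in the talk): the covariance loss is nonnegative by the law of total covariance, and shrinks as the partition generating $\mathcal{F}$ is refined.

```python
import numpy as np

rng = np.random.default_rng(3)
m, N = 6, 20000
X = rng.standard_normal((N, m))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # uniform on the sphere, so ||X||_2 <= 1

def covariance_loss(j):
    # partition R^m by the sign pattern of the first j coordinates: k = 2^j cells
    bits = (X[:, :j] > 0).astype(int)
    cell = bits @ (2 ** np.arange(j))
    cov_x = np.cov(X, rowvar=False, bias=True)
    mu = X.mean(axis=0)
    cov_e = -np.outer(mu, mu)                   # Cov(E[X|F]) = sum_c p_c mu_c mu_c^T - mu mu^T
    for c in range(2 ** j):
        mask = cell == c
        p = mask.mean()
        mu_c = X[mask].mean(axis=0)
        cov_e += p * np.outer(mu_c, mu_c)
    return np.linalg.norm(cov_x - cov_e)        # Frobenius covariance loss

loss_coarse, loss_fine = covariance_loss(1), covariance_loss(4)
print(loss_coarse, loss_fine)
```

This naive partition only achieves a slow decay in $k$; the point of the talk is that a well-chosen partition into $k$ sets drives the loss down at the optimal $1/\sqrt{\log k}$ rate.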

Spectral asymptotics for contracted tensor ensembles

Speaker: 

Jorge Garza-Vargas

Institution: 

UC Berkeley

Time: 

Wednesday, April 13, 2022 - 2:00pm to 3:00pm

Location: 

510R Rowland Hall

Let $T_{d, N}$ be a random symmetric Wigner-type tensor of dimension $N$ and order $d$. For unit vectors $u_N^{(1)}, \dots, u_{N}^{(d-2)}$ we study the random matrix obtained by taking the contracted tensor $\frac{1}{N} T_{d, N} \left[u_N^{(1)}\otimes \cdots \otimes u_N^{(d-2)} \right]$ and show that, for large $N$, its empirical spectral distribution concentrates around a semicircular distribution whose radius is an explicit symmetric function of the $u_N^{(i)}$. We further generalize this result by considering a family of contractions of $T_{d, N}$ and show, using free probability concepts, that its joint distribution is well-approximated by a non-commutative semicircular family when $N$ is large. This is joint work with Benson Au (https://arxiv.org/abs/2110.01652).
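A simulation consistent with the semicircle statement, for the simplest case $d = 3$ (entry and contraction normalizations below are assumptions chosen so the limit has radius 2): symmetrize a Gaussian order-3 tensor, contract with one unit vector, and inspect the spectrum of the resulting Wigner-like matrix.

```python
import numpy as np
from itertools import permutations

rng = np.random.default_rng(6)
N = 200
G = rng.standard_normal((N, N, N))
T = np.zeros((N, N, N))
for p in permutations(range(3)):          # symmetrize over index permutations
    T += G.transpose(p)
T /= np.sqrt(6)                           # unit variance for distinct-index entries

u = rng.standard_normal(N)
u /= np.linalg.norm(u)                    # unit contraction vector
M = np.einsum('ijk,k->ij', T, u) / np.sqrt(N)   # contracted tensor, Wigner scaling
eigs = np.linalg.eigvalsh(M)              # M is symmetric since T is
print(eigs.min(), eigs.max(), eigs.std())
```

The eigenvalues fill out (approximately) a semicircle of radius 2, i.e. standard deviation close to 1, matching the concentration phenomenon described above for this normalization.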

High-dimensional Asymptotics of Feature Learning After One Step of Gradient Descent

Speaker: 

Zhichao Wang

Institution: 

UCSD

Time: 

Wednesday, March 30, 2022 - 2:00pm to 3:00pm

Location: 

510R Rowland Hall

In this talk, I will discuss the spectral properties of a two-layer neural network after one gradient step and their applications in random ridge regression. We consider the first gradient step in a two-layer randomly initialized neural network with the empirical MSE loss. In the proportional asymptotic limit, where all the dimensions go to infinity at the same rate, and an idealized student-teacher setting, we will show that the first gradient update contains a rank-1 "spike", which results in an alignment between the first-layer weights and the linear component of the teacher model. By verifying a Gaussian equivalent property, we can compute the prediction risk of ridge regression on the conjugate kernel after one gradient step. We will present two scalings of the first-step learning rate. For a small learning rate, we compute the asymptotic risk of the ridge regression estimator on top of the trained features; this does not outperform the best linear model. For a sufficiently large learning rate, however, we prove that the ridge estimator on the trained features can go beyond this "linear" regime. Our analysis demonstrates that even one gradient step can lead to an advantage over the initial features. Our theoretical results are mainly based on random matrix theory and operator-valued free probability theory, which will be summarized in this talk. This is recent joint work with Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Denny Wu, and Greg Yang.
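The rank-1 spike can be seen numerically. The toy sketch below (ReLU activation, linear teacher, illustrative sizes; a cartoon of the setting, not the authors' computation) forms the full-batch first-step gradient of the first-layer weights and checks that its top singular value dominates and that the corresponding right singular direction aligns with the teacher.

```python
import numpy as np

rng = np.random.default_rng(4)
d, m, n = 100, 100, 4000                  # input dim, hidden width, samples
W = rng.standard_normal((m, d)) / np.sqrt(d)   # first-layer init
a = rng.choice([-1.0, 1.0], m)                  # second-layer weights, held fixed
beta = rng.standard_normal(d)
beta /= np.linalg.norm(beta)                    # teacher direction

X = rng.standard_normal((n, d))
y = X @ beta                              # linear teacher

Z = X @ W.T                               # preactivations, shape (n, m)
f = np.maximum(Z, 0) @ a / np.sqrt(m)     # ReLU network output
# full-batch gradient of the MSE loss (1/2n) sum (f_i - y_i)^2 w.r.t. W
resid = (f - y) / n
G = ((resid[:, None] * (Z > 0)) * (a / np.sqrt(m))).T @ X   # shape (m, d)

U, s, Vt = np.linalg.svd(G)
alignment = abs(Vt[0] @ beta)             # top right singular vector vs. teacher
print(s[0] / s[1], alignment)
```

The leading singular value stands clearly above the rest, and its right singular vector points (approximately) along the teacher's linear component, which is exactly the alignment phenomenon the talk makes precise in the proportional limit.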
