Many problems in data analytics can be formulated as estimating a low-rank matrix from
noisy data. This task is particularly challenging in the face of unconventional data corruption and the growing size of modern datasets. Addressing these challenges requires joint statistical-computational considerations: developing methods that are both robust and scalable.
In this talk, we consider two examples that exhibit quite different statistical-computational
behaviors. The first example is recovering a low-rank matrix when only a small number of its
entries are observed, and observations from some columns are completely and arbitrarily
corrupted. We propose recovery algorithms based on convex relaxation, which achieve optimal
statistical performance in terms of sample complexity and robustness. The second example
is finding densely connected communities in noisy networks, for which we develop efficient
algorithms based on a convexified maximum likelihood estimator (MLE). These methods have
strong recovery guarantees under classical random graph models, but do not achieve the optimal
statistical performance of the intractable MLE, hence demonstrating a computational barrier.
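To make the convex-relaxation approach of the first example concrete, here is a minimal sketch of singular value thresholding (SVT), a standard iterative scheme for the nuclear-norm relaxation of matrix completion. This is an illustration, not the exact algorithm from the talk: the matrix size, sampling rate, and the parameters `tau` and `delta` are assumed illustrative choices, and the corrupted-column aspect is omitted for simplicity.

```python
import numpy as np

def svt_complete(M_obs, mask, tau, delta, iters=1000):
    """Singular value thresholding (SVT) sketch for nuclear-norm matrix
    completion: approximately solve  min ||X||_*  subject to X matching
    the observed entries, via soft-thresholding of singular values."""
    Y = np.zeros_like(M_obs)  # dual iterate
    X = np.zeros_like(M_obs)
    for _ in range(iters):
        # Proximal step: soft-threshold the singular values of Y by tau.
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
        # Dual ascent on the equality constraint over observed entries.
        Y += delta * mask * (M_obs - X)
    return X

# Rank-1 ground truth with about 60% of entries observed at random.
rng = np.random.default_rng(0)
n = 20
M = np.outer(rng.standard_normal(n), rng.standard_normal(n))
mask = (rng.random((n, n)) < 0.6).astype(float)
X_hat = svt_complete(M * mask, mask, tau=5 * n, delta=1.5)
```

With enough observed entries of an incoherent low-rank matrix, the recovered `X_hat` agrees with the ground truth on both observed and unobserved entries, which is the sample-complexity phenomenon the abstract refers to.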
While convex optimization methods run in polynomial time, their computational cost is
often too high for large real-world problems. We develop a general framework for low-rank
estimation using non-convex gradient descent that is highly efficient computationally. We
provide rigorous and global theoretical guarantees for the statistical performance of these
methods. Our general results apply to the two problems above, among others; in both cases we
match the statistical performance achieved by convex relaxation.
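As a minimal illustration of the non-convex approach (again a sketch under simplifying assumptions, not the talk's exact procedure), one can factor the unknown matrix as X = U Vᵀ and run plain gradient descent on the squared residual. The rank-1, fully observed setting, the small random initialization, and the step size below are all assumptions made for the example.

```python
import numpy as np

def factored_gd(M, r=1, step=0.02, iters=500, seed=0):
    """Non-convex gradient descent sketch for low-rank estimation:
    parameterize X = U @ V.T and descend on
    f(U, V) = 0.5 * ||U @ V.T - M||_F^2  (fully observed case)."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    # Small random initialization; theory often uses spectral initialization.
    U = 0.1 * rng.standard_normal((m, r))
    V = 0.1 * rng.standard_normal((n, r))
    for _ in range(iters):
        R = U @ V.T - M  # current residual
        # Simultaneous gradient steps on both factors.
        U, V = U - step * (R @ V), V - step * (R.T @ U)
    return U @ V.T

rng = np.random.default_rng(1)
M = np.outer(rng.standard_normal(10), rng.standard_normal(10))  # rank-1 target
X_hat = factored_gd(M, r=1)
```

Each iteration costs only a few matrix-vector-scale products in the factors, rather than a full singular value decomposition, which is the source of the computational savings over convex-relaxation solvers.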