ORIE Colloquium: Lijun Ding/Angela Zhou (Cornell University), March 2, 2021
From Henry Lam
Title: Low rank matrix optimization
Abstract: This talk consists of two parts:
(1) semidefinite programming with low-rank solutions; (2) statistical low-rank matrix recovery.
In the first part, I will present a storage-optimal and time-efficient algorithm, called CSSDP (complementary slackness SDP), for solving weakly constrained semidefinite programs with low-rank solutions.
I shall present the algorithm, the role of complementary slackness in its design, and a comparison of its complexity with that of existing solvers.
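For context (the abstract does not spell out the CSSDP formulation, so this is only the generic setting), the standard primal-dual SDP pair and the complementary slackness condition it satisfies at optimality can be sketched as:

\min_{X \succeq 0} \ \langle C, X \rangle \quad \text{s.t.} \quad \langle A_i, X \rangle = b_i, \ i = 1, \dots, m,
\qquad
\max_{y \in \mathbb{R}^m} \ b^\top y \quad \text{s.t.} \quad Z := C - \sum_{i=1}^m y_i A_i \succeq 0.

At a primal-dual optimal pair (under strong duality), complementary slackness gives XZ = 0, hence rank(X) + rank(Z) <= n: a high-rank dual slack Z certifies a low-rank primal solution X, which is the kind of structure a complementarity-based solver can exploit.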
In the second part, I will present an algorithm, called AVPG (averaging projected gradient), for solving statistical rank-constrained problems. I shall present its main application to generalized linear models with rank constraints, its advantages over existing algorithms, and the idea behind the proof of its global linear convergence.
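The abstract gives no algorithmic details for AVPG, so the following is only a generic projected-gradient sketch for a rank-constrained problem (a toy matrix-completion loss rather than a GLM loss), with the rank projection computed by truncated SVD; all names and parameters below are illustrative:

import numpy as np

def project_rank(M, r):
    # Truncated SVD: best rank-r approximation of M in Frobenius norm.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]

def projected_gradient(X0, grad, r, step=1.0, n_iters=300):
    # Generic projected gradient over matrices of rank at most r.
    # A schematic stand-in, not the AVPG algorithm from the talk.
    X = project_rank(X0, r)
    for _ in range(n_iters):
        X = project_rank(X - step * grad(X), r)
    return X

# Toy instance: recover a rank-2 matrix from roughly half of its entries.
rng = np.random.default_rng(0)
n, r = 30, 2
X_true = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
mask = rng.random((n, n)) < 0.5
Y = X_true * mask

grad = lambda X: (X - Y) * mask  # gradient of 0.5 * ||mask * (X - Y)||_F^2
X_hat = projected_gradient(np.zeros((n, n)), grad, r)
print("relative error:", np.linalg.norm(X_hat - X_true) / np.linalg.norm(X_true))

In the talk's setting the squared loss would be replaced by a rank-constrained generalized linear model loss; in this sketch only the gradient callable would change.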
Title: Robust Personalization from Observational Data
Abstract: Learning to make decisions from datasets in realistic environments is subject to practical challenges such as unobserved confounders, missingness, and bias, which may undermine the otherwise beneficial impacts of data-driven decisions. In this talk, I introduce a methodological framework for learning causal-effect-maximizing personalized decision policies in the presence of unobserved confounders. Recent work generally assumes unconfoundedness: that there are no unobserved confounders affecting treatment and outcome, which is often untrue for widely available observational data. I develop a methodological framework that accounts for possible unobserved confounding by minimizing the worst-case estimated regret over an ambiguity set for propensity weights. I prove generalization guarantees and present a semi-synthetic case study on personalizing hormone replacement therapy based on the parallel WHI observational study and clinical trial. Hidden confounding can lead to unwarranted harm, whereas the robust approach guarantees safety and focuses on well-evidenced improvement.
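As a minimal sketch of this objective (the abstract does not specify the exact ambiguity set; a marginal-sensitivity-style odds-ratio bound with parameter Gamma is one common choice): with covariates X_i, treatments A_i, outcomes Y_i, and nominal propensity estimates \hat e(X_i), an inverse-propensity-weighted value estimate under candidate weights W is

\hat V_W(\pi) = \frac{1}{n} \sum_{i=1}^n W_i \, \mathbf{1}\{A_i = \pi(X_i)\} \, Y_i,

and a confounding-robust policy minimizes the worst-case estimated regret against a baseline policy \pi_0 over the weight ambiguity set \mathcal{W}_\Gamma:

\hat\pi \in \arg\min_{\pi \in \Pi} \ \max_{W \in \mathcal{W}_\Gamma} \big( \hat V_W(\pi_0) - \hat V_W(\pi) \big),

where \mathcal{W}_\Gamma constrains each W_i to lie near the nominal inverse-propensity weight for the realized treatment, and \Gamma = 1 recovers the unconfounded case.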
In the second part of this talk, I highlight follow-up work on leveraging these ideas to develop robust bounds for off-policy evaluation in batch (offline) reinforcement learning in the infinite-horizon setting.