ORIE Colloquium: Daniel Kuhn (College of Management of Technology at EPFL), March 23, 2021
From Henry Lam
"A General Framework for Optimal Data-Driven Optimization"
We propose a statistically optimal approach to construct
data-driven decisions for stochastic optimization problems. Fundamentally, a
data-driven decision is simply a function that maps the available training data
to a feasible action. It can always be expressed as the minimizer of a
surrogate optimization model constructed from the data. The quality of a
data-driven decision is measured by its out-of-sample risk. An additional
quality measure is its out-of-sample disappointment, which we define as the
probability that the out-of-sample risk exceeds the optimal value of the
surrogate optimization model. The crux of data-driven optimization is that the
data-generating probability measure is unknown. An ideal data-driven decision
should therefore minimize the out-of-sample risk simultaneously with respect to
every conceivable probability measure (and thus in particular with respect to
the unknown true measure). Unfortunately, such ideal data-driven decisions are
generally unavailable. This prompts us to seek data-driven decisions that
minimize the out-of-sample risk subject to an upper bound on the out-of-sample
disappointment, again simultaneously with respect to every conceivable
probability measure. We prove that such Pareto-dominant data-driven decisions
exist under conditions that allow for interesting applications: the unknown
data-generating probability measure must belong to a parametric ambiguity set,
and the corresponding parameters must admit a sufficient statistic that
satisfies a large deviation principle. If these conditions hold, we can further
prove that the surrogate optimization model generating the optimal data-driven
decision must be a distributionally robust optimization problem constructed
from the sufficient statistic and the rate function of its large deviation
principle. This shows that the optimal method for mapping data to decisions is,
in a rigorous statistical sense, to solve a distributionally robust
optimization model. Perhaps surprisingly, this result holds irrespective of
whether the original stochastic optimization problem is convex or not and holds
even when the training data is non-i.i.d. As a byproduct, our analysis reveals
how the structural properties of the data-generating stochastic process impact
the shape of the ambiguity set underlying the optimal distributionally robust
optimization model.
*This is joint work with Tobias Sutter and Bart Van Parys.
Daniel Kuhn is
Professor of Operations Research at the College of Management of Technology at
EPFL, where he holds the Chair of Risk Analytics and Optimization (RAO). His
current research interests are focused on data-driven optimization, the
development of efficient computational methods for the solution of stochastic
and robust optimization problems and the design of approximation schemes that
ensure their computational tractability. This work is primarily
application-driven, the main application areas being engineered systems,
machine learning, business analytics and finance.
Before joining EPFL,
Daniel Kuhn was a faculty member in the Department of Computing at
Imperial College London (2007–2013) and a postdoctoral research associate in
the Department of Management Science and Engineering at Stanford University
(2005–2006). He holds a PhD degree in Economics from the University of St. Gallen
and an MSc degree in Theoretical Physics from ETH Zurich. He serves as the area
editor for continuous optimization for Operations Research and as an associate
editor for several other journals including Management Science, Mathematical
Programming, Mathematics of Operations Research and Operations Research Letters.