Many operational problems in data-rich environments can be characterized by three primitives: data on uncertain quantities of interest, such as simultaneous demands; concurrent auxiliary data, such as recent sales figures, social media attention, or user reviews; and an operational decision to be made with the objective of minimizing costs, maximizing profits, or attenuating risks. By and large, auxiliary data has not been extensively incorporated into OR modeling of decision making under uncertainty. Machine learning (ML), on the other hand, has largely focused on leveraging such data for supervised learning, but it does not generally address the optimal decision making that operational problems require. At the same time, an explosion in the availability of data has enabled applications of ML that predict quantities of interest in such problems, for example predicting box-office ticket sales from Twitter chatter. It is not clear, however, how to go from a good prediction of this kind to a good operational decision. In this paper, we combine ideas from ML and OR to develop a theoretical framework and specific methods for prescribing optimal decisions in operational problems based directly on data and leveraging predictive observations. We study the asymptotics of our proposals under sampling assumptions more general than i.i.d. We introduce a metric, the coefficient of prescriptiveness, to measure the prescriptive content of data and the efficacy of a prescription. To demonstrate the power of our approach in a real-world setting, we study an inventory management problem faced by the distribution arm of an international media conglomerate, which manages over 1 million unique items at some 100,000 retail locations around the world. We leverage both internal company data and, in the spirit of the aforementioned predictive applications, large-scale public data harvested from online sources to prescribe operational decisions that demonstrably outperform baseline measures.
This is joint work with Dimitris Bertsimas, MIT.