In many intelligence agencies, the screening of data into usable information ready for analysis
poses a significant bottleneck. Typically, far more data is available than can be
screened in the allotted time. We call the staff that screens raw data into usable information
intelligence processors, or collectors.
We formulate the problem faced by an intelligence processor — selecting which data to screen
— as an exploration-exploitation problem: the collector has to choose between exploring for
new sources of relevant information and exploiting known sources.
To address the exploration-exploitation problem, we develop a mathematical model of the
collector’s knowledge and examine algorithms that allow the collector to maximize the
discovery of relevant data given a time limit. We computationally test the model and gain
insight into solutions using a simulated intelligence data set based on the Enron social network
and email corpus.
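
As a rough illustration of the exploration-exploitation trade-off described above, the following sketch uses an epsilon-greedy rule, a standard heuristic that is swapped in here for illustration and is not the model or algorithm developed in this work. It assumes that each data source yields a relevant item with a fixed, unknown probability and that screening one item costs one unit of time; the function name, parameters, and probabilities are all hypothetical.

```python
import random


def screen_with_epsilon_greedy(relevance_probs, budget, epsilon=0.1, seed=0):
    """Screen `budget` items, one per time step.

    relevance_probs: hypothetical per-source probability that a screened
    item is relevant (unknown to the collector, used only to simulate).
    Returns (number of relevant items found, items screened per source).
    """
    rng = random.Random(seed)
    n = len(relevance_probs)
    pulls = [0] * n  # items screened from each source so far
    hits = [0] * n   # relevant items found in each source so far

    for _ in range(budget):
        untried = [i for i in range(n) if pulls[i] == 0]
        if untried:
            # Screen each source once before comparing observed hit rates.
            source = untried[0]
        elif rng.random() < epsilon:
            # Explore: screen a source chosen uniformly at random.
            source = rng.randrange(n)
        else:
            # Exploit: screen the source with the best observed hit rate.
            source = max(range(n), key=lambda i: hits[i] / pulls[i])

        pulls[source] += 1
        # Simulate whether the screened item turns out to be relevant.
        if rng.random() < relevance_probs[source]:
            hits[source] += 1

    return sum(hits), pulls


# With no exploration and one source that is always relevant, the rule
# locks onto that source after trying each source once.
found, pulls = screen_with_epsilon_greedy([0.0, 1.0], budget=10, epsilon=0.0)
print(found, pulls)  # 9 [1, 9]
```

A larger epsilon spends more of the time budget on exploring sources, while a smaller one commits earlier to the sources that have looked best so far.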
Operations Research & Industrial Engineering
University of Texas at Austin