David D. Jensen of the University of Massachusetts is involved in US efforts to mine data about individuals who might pose a threat to the USA (the "Evidence Extraction and Link Discovery" program).
This is "technology not only for 'connecting the dots' that enable the U.S. to predict and pre-empt attacks, but also for deciding which dots to connect." It is among the US government's most controversial research programs.
Jensen's work shows how flexible such powerful software can be. He used two online databases, the Internet Movie Database and the Physics Preprint Archive, to develop tools that would predict whether a movie would gross more than $2 million in its opening weekend and would identify authoritative physics authors.
University of Southern California professor Craig Knoblock said he developed software that automatically extracted information from travel websites and telephone books and tracked changes over time.
The Advanced Research and Development Activity (ARDA) sponsors corporate and university research on information technology for U.S. intelligence agencies. It is developing computer software that can extract information from databases as well as text, voices, other audio, video, graphs, images, maps, equations and chemical formulas. It calls its effort Novel Intelligence from Massive Data.
ARDA said it has not given researchers government or private data and obeys privacy laws.
The project is part of its effort "to help the nation avoid strategic surprise ... events critical to national security ... such as those of September 11, 2001," the office said.
The hope is that it may be possible to develop software that could quickly analyze "multiple petabytes" of data. One petabyte would fill the Library of Congress's space for 18 million books more than 50 times over. It could also hold 40 pages of text for each of the more than 6.2 billion humans on Earth.
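The petabyte comparisons can be sanity-checked with a few lines of arithmetic. This is only a rough sketch: it assumes a decimal petabyte (10^15 bytes), and the function names are made up for illustration. The population, page and book counts are the figures quoted above.

```python
# Rough check of the petabyte comparisons quoted in the text.
# Assumption (not from the source): a decimal petabyte of 10**15 bytes.
PETABYTE_BYTES = 10**15

POPULATION = 6.2e9          # "more than 6.2 billion humans"
PAGES_PER_PERSON = 40       # "40 pages of text for each"
LIBRARY_BOOKS = 18e6        # "space for 18 million books"
LIBRARY_FILLS = 50          # "more than 50 times"


def bytes_per_page():
    """Bytes available per page if one petabyte is split 40 pages per person."""
    return PETABYTE_BYTES / (POPULATION * PAGES_PER_PERSON)


def bytes_per_book():
    """Bytes available per book if one petabyte holds the library 50 times."""
    return PETABYTE_BYTES / (LIBRARY_BOOKS * LIBRARY_FILLS)


if __name__ == "__main__":
    print(f"per page: {bytes_per_page():,.0f} bytes")
    print(f"per book: {bytes_per_book():,.0f} bytes")
```

Under these assumptions the figures work out to about 4 KB per page and about 1.1 MB per book, both plausible sizes for plain text, so the two comparisons are at least consistent with each other.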
ARDA said the software would have to deal with "typically a petabyte or more" of data. It noted that some intelligence data sources "grow at the rate of four petabytes per month." Experts said those are probably files with satellite surveillance images and electronic eavesdropping results.
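To put "four petabytes per month" in more familiar terms, the rate can be converted to bytes per second. The sketch below assumes a decimal petabyte and a 30-day month, neither of which is stated in the source, and the function name is illustrative.

```python
# Convert a monthly growth rate in petabytes to an average rate per second.
# Assumptions (not from the source): decimal petabyte, 30-day month.
PETABYTE_BYTES = 10**15
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def growth_bytes_per_second(petabytes_per_month=4):
    """Average sustained growth rate in bytes per second."""
    return petabytes_per_month * PETABYTE_BYTES / SECONDS_PER_MONTH


if __name__ == "__main__":
    rate = growth_bytes_per_second()
    print(f"{rate / 1e9:.2f} GB per second")
```

On those assumptions the quoted rate averages roughly 1.5 gigabytes of new data every second.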
Follow me on Twitter: @IanYorston