DARPA seeks mathematical framework to characterize fundamental limits of machine learning

[Image: the DARPA Fun LoL program logo]

Yes, a bizarre logo indeed

30 May 2016 – As my American colleagues enjoy the “officially sanctioned” start of summer this Memorial Day weekend (summer “officially” ends in the U.S. in September on Labor Day weekend, the calendar notwithstanding), I decided to traipse through my DARPA feed to see what was new and exciting and what popped up was (SURPRISE!!) … artificial intelligence.

It’s not easy to put the intelligence in artificial intelligence. Current machine learning techniques generally rely on huge amounts of training data, vast computational resources, and a time-consuming trial and error methodology. Even then, the process typically results in learned concepts that aren’t easily generalized to solve related problems or that can’t be leveraged to learn more complex concepts. The process of advancing machine learning could no doubt go more efficiently—but how much so?

To date, very little is known about the limits of what could be achieved for a given learning problem or even how such limits might be determined. To find answers to these questions, DARPA has announced its Fundamental Limits of Learning (Fun LoL) program which, based on its name and logo, I perceived to be a late April Fool’s Day joke. But … they are serious. The objective of Fun LoL is to investigate and characterize fundamental limits of machine learning with supportive theoretical foundations to enable the design of systems that learn more efficiently. Said Reza Ghanadan, DARPA program manager, in the news release:

What’s lacking is a fundamental theoretical framework for understanding the relationships among data, tasks, resources, and measures of performance — elements that would allow us to more efficiently teach tasks to machines and allow them to generalize their existing knowledge to new situations. With Fun LoL we’re addressing how the quest for the ultimate learning machine can be measured and tracked in a systematic and principled way. As it stands now with machine learning, even a small change in task often requires programmers to create an entirely new machine teaching process. If you slightly tweak a few rules of the game Go, for example, the machine won’t be able to generalize from what it already knows. Programmers would need to start from scratch and re-load a data set on the order of tens of millions of possible moves to account for the updated rules.

These kinds of challenges are especially pressing for the Department of Defense, whose specialized systems typically don’t have large training sets to begin with and can’t afford to rely on trial and error methods, which come at high cost. Additionally, defense against complex threats requires machines to adapt and learn quickly, so it is important that they be able to generalize creatively from previously learned concepts.

According to DARPA, Fun LoL seeks information regarding mathematical frameworks, architectures, and methods that would help answer questions such as:

  • How many training examples are necessary to achieve a given level of accuracy? (e.g., Would a training set with fewer than the 30 million moves that programmers provided to this year’s winning machine have sufficed to beat a Go grand champion? How do you know? See the sample-complexity sketch after this list.)
  • What are the important trade-offs and their implications? (e.g., trade-offs among training-set size, accuracy, and processing power)
  • How “efficient” is a given learning algorithm for a given problem?
  • How close does the expected performance of a given learning algorithm come to the theoretical limit of what can be achieved?
  • What are the effects of noise and error in the training data?
  • What are the potential gains possible due to the statistical structure of the model generating the data?
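For a flavor of what such a framework might look like, classical PAC learning theory already answers a toy version of the first question: for a finite hypothesis class H (a far simpler setting than Go), roughly (1/ε)(ln|H| + ln(1/δ)) labeled examples suffice to find, with probability at least 1 − δ, a hypothesis with error at most ε. A minimal Python sketch, with purely illustrative numbers:

```python
import math

def pac_sample_bound(hypothesis_count: int, epsilon: float, delta: float) -> int:
    """Classic PAC bound for a finite hypothesis class (realizable case):
    m >= (1/epsilon) * (ln|H| + ln(1/delta)) examples suffice to find,
    with probability >= 1 - delta, a hypothesis with error <= epsilon."""
    return math.ceil((math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon)

# Illustrative numbers only (not drawn from the Go example): a class of
# one million hypotheses, 5% target error, 99% confidence.
m = pac_sample_bound(hypothesis_count=1_000_000, epsilon=0.05, delta=0.01)
print(f"Examples sufficient under the PAC bound: {m}")  # 369
```

Note that the bound grows only logarithmically in |H|, which hints at why Fun LoL’s version of the question is hard: for a deep network playing Go, the effective hypothesis class is neither finite nor easy to count, and no comparably tight formula is known.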

As an historical example of what Fun LoL is trying to achieve, Ghanadan pointed to a mathematical construct called the Shannon-Hartley Theorem that helped revolutionize communications theory. That theorem established a mathematical framework showing that for any given communication channel, it is possible to communicate information nearly error-free up to a computable maximum rate through that channel. The theorem addresses trade-offs in bandwidth, source data distribution, noise, methods of communication transmission, error-correction coding, measures of information, and other factors that can affect determinations of communications efficiency. Said Ghanadan:

Shannon’s theorem provided the fundamental basis that catalyzed the widespread operational application of modern digital and wireless communications. The goal of Fun LoL is to achieve a similar mathematical breakthrough for machine learning and AI.
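For readers who haven’t seen it, the theorem Ghanadan invokes is compact enough to compute directly: a channel of bandwidth B and signal-to-noise ratio S/N can carry at most C = B log2(1 + S/N) bits per second with arbitrarily low error. A quick Python sketch; the 3 kHz / 30 dB figures are the textbook voice-line example, not anything from the DARPA release:

```python
import math

def shannon_capacity(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon-Hartley: C = B * log2(1 + S/N), the maximum rate (bits/s)
    achievable with arbitrarily low error over a noisy channel of
    bandwidth B (Hz) and the given signal-to-noise ratio."""
    snr_linear = 10 ** (snr_db / 10)  # convert dB to a linear power ratio
    return bandwidth_hz * math.log2(1 + snr_linear)

# Textbook example: a 3 kHz voice-grade telephone line at 30 dB SNR.
capacity = shannon_capacity(bandwidth_hz=3_000, snr_db=30)
print(f"Channel capacity: {capacity:,.0f} bits/s")  # ~29,902 bits/s
```

That ceiling of roughly 30 kbit/s is about where analog voice-line modems eventually topped out, and it is exactly this kind of computable, provable limit that Fun LoL hopes to establish for learning problems.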

DARPA’s Fun LoL is seeking information that could inform novel approaches to this problem. Technology areas that may be relevant include information theory, computer science theory, statistics, control theory, machine learning, AI, and cognitive science.

For more information, see the Request for Information posted on FedBizOpps. The deadline for responses is June 7, 2016.
