While deep learning is an increasingly popular form of artificial intelligence used in products and services that impact hundreds of millions of lives—it’s deployed in robots, driverless cars and systems that decide who should go to jail and who should get a credit card—no one quite understands how it actually works.
Tom Goldstein, an associate professor of computer science with an appointment in the University of Maryland Institute for Advanced Computer Studies, recently joined a multi-institutional team of engineers, computer scientists, mathematicians and statisticians who are tasked with unravelling this mystery.
The researchers—from Rice University, Johns Hopkins University, Texas A&M University, the University of Maryland, the University of Wisconsin, UCLA and Carnegie Mellon University—will focus their efforts on developing a theory of deep learning based on mathematical principles.
“Deep learning has allowed researchers and practitioners to accomplish amazing things, but much of this progress has been accomplished through painstaking trial and error because we don't have any grasp on the fundamental mathematical principles that make deep learning methods succeed or fail,” Goldstein says.
Goldstein explains that the research team will build a rigorous mathematical theory for how and why deep learning methods work. He hopes their work will contribute to faster and easier methods for training deep networks, new methods that learn from smaller datasets, and artificial intelligence (AI) systems that make more reliable, trustworthy and understandable decisions.
The five-year project is funded by a $7.5 million grant from the Office of Naval Research through the Department of Defense’s Multidisciplinary University Research Initiative (MURI) program.
The MURI-funded researchers plan to focus their efforts on three different perspectives.
One is mathematical. It turns out that deep networks are very easy to describe locally, the researchers say. For example, if you look at what's going on in a specific neuron, it’s actually easy to describe. But you don’t understand how those pieces—literally millions of them—fit together into a global whole. The MURI team will use mathematical models to examine this “local to global” understanding.
A second perspective is statistical. What happens when the input signals have randomness? The research team is interested in being able to predict how well a network will perform when the input signals are changed.
The third perspective is formal methods, or formal verification, a field that deals with the problem of verifying whether systems are functioning as intended, especially when they are so large or complex that it is impossible to check each line of code or individual component.
Richard Baraniuk, the Victor E. Cameron Professor of Electrical and Computer Engineering at Rice University, is principal investigator on the project. He has spent nearly three decades studying signal processing in general and machine learning in particular, the branch of AI to which deep learning belongs.
Baraniuk says that the MURI investigators have each previously worked on pieces of the overall solution, and the grant will enable them to collaborate and draw upon one another’s work to go in new directions.
“Ultimately, the goal is to develop a set of rigorous principles that can take the guesswork out of designing, building, training and using deep neural networks,” he says.
(This article was adapted from a news release from Rice University.)
October 6, 2020
Goldstein Part of $7.5M MURI Award to Study Theoretical Aspects of Deep Learning