Will the current resurgence in artificial intelligence turn into another AI winter? Two prominent experts in the field — Eric Horvitz and Yann LeCun — don’t think so. “It’s unclear how fast things are going to go with machine intelligence. I’m uncertain myself,” said Horvitz, director at the Microsoft Research laboratory in Redmond, Wash. “But I think it’s pretty clear that we’ll have some big developments soon that will be very valuable.”
Horvitz pointed to two “core developments” that have gotten AI to where it is today. He referred to the first core development as a “revolution of probability” — that is, a move from logic-based reasoning to reasoning under uncertainty. [This development] “stands on the shoulders of disciplines including statistics, operations research and, of course, probability and decision science,” Horvitz said at a recent emerging technology conference — dubbed EmTech — hosted by the MIT Technology Review.
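The shift Horvitz describes is easiest to see in a toy calculation. The sketch below is my own illustration, not from the talk, and every number in it is made up; it uses Bayes’ rule to update a belief in light of evidence, the kind of reasoning under uncertainty that replaced hard true-or-false logic:

```python
# Toy illustration of reasoning under uncertainty: Bayes' rule revises a
# prior belief given new evidence, rather than concluding true or false.
# All probabilities below are invented for the example.
p_disease = 0.01              # prior: 1% of patients have the condition
p_pos_given_disease = 0.95    # test sensitivity
p_pos_given_healthy = 0.05    # false-positive rate

p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

print(f"P(disease | positive test) = {p_disease_given_pos:.3f}")  # ~0.161
```

Even with a positive test, the updated belief is only about 16%, a graded conclusion no purely logic-based system could express.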
But it’s Horvitz’s second core development that will likely strike a chord with CIOs, who should notice parallels between the technological improvements underpinning this latest AI revitalization and those that brought big data to the forefront. “The second has been a revolution in machine learning, but this has largely been fueled in recent years by the almost limitless storage we have for capturing data, as well as the rise of the data sources through the ubiquity and connectivity of the Web,” he said. “That, along with computation, has fueled a renaissance in machine learning, which has become a core pillar of where we are today in AI.”
That’s especially true for one area of AI that’s receiving plenty of buzz these days: deep learning, a type of machine learning described by some proponents as analogous to how the human brain processes information. Yann LeCun, a luminary in the field, a professor at New York University and the director of AI research at Facebook, described deep learning as an old idea (one that arguably dates back to the 1950s) that is undergoing a renaissance thanks to these technological developments. He pointed specifically to machines that can now handle the large data sets needed to train, or “teach,” deep learning systems effectively, and to graphics processing units (GPUs), “which are highly parallel — most of the new chips have something like 2,000 cores on them,” LeCun said at EmTech.
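To make the hardware point concrete, here is a minimal sketch of the kind of workload those parallel cores absorb: the large matrix multiplications that dominate neural-net training. It assumes the PyTorch library (my choice, not mentioned by LeCun) and an optional CUDA GPU; the matrix sizes are arbitrary:

```python
# A sketch of why GPUs matter for deep learning: the same large matrix
# multiplication that dominates training runs across thousands of GPU
# cores in parallel. Sizes and device names here are illustrative.
import time
import torch

x = torch.randn(4096, 4096)
w = torch.randn(4096, 4096)

def timed_matmul(device: str) -> float:
    a, b = x.to(device), w.to(device)
    start = time.perf_counter()
    for _ in range(10):
        a @ b
    if device == "cuda":
        torch.cuda.synchronize()  # wait for the GPU's queued work to finish
    return time.perf_counter() - start

print(f"CPU: {timed_matmul('cpu'):.2f}s")
if torch.cuda.is_available():  # only if a CUDA-capable GPU is present
    print(f"GPU: {timed_matmul('cuda'):.2f}s")
```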
The advent of neural networks
In the 1990s, LeCun made a notable mark on a type of deep learning known as convolutional neural networks, a design loosely inspired by the brain’s visual cortex. Convolutional neural networks (convnets) process data in layers to identify an image or text, starting at the lowest layers by picking out basic features in the unstructured data and passing the findings up to the highest layers, where what’s being portrayed is eventually identified. While working at AT&T Bell Laboratories 25 years ago, LeCun built a neural net that could recognize handwritten numbers, a technique that proved so successful that banks deployed the technology in ATMs to read check deposits.
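For readers who want to see that layered structure in code, below is a minimal sketch of a LeNet-style convnet in PyTorch. The layer sizes are illustrative, not LeCun’s original design; the comments mark which layers pick out low-level features and which ones settle on the digit’s identity:

```python
# A minimal sketch of a LeNet-style convolutional net for 28x28
# handwritten digits. Layer widths are illustrative, not LeCun's
# original architecture.
import torch
import torch.nn as nn

class TinyConvNet(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),   # low layer: edge/stroke detectors
            nn.ReLU(),
            nn.MaxPool2d(2),                  # 24x24 -> 12x12
            nn.Conv2d(6, 16, kernel_size=5),  # mid layer: combinations of strokes
            nn.ReLU(),
            nn.MaxPool2d(2),                  # 8x8 -> 4x4
        )
        self.classifier = nn.Linear(16 * 4 * 4, 10)  # top layer: digit identity

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(1))

net = TinyConvNet()
digits = torch.randn(8, 1, 28, 28)  # a batch of 8 fake digit images
print(net(digits).shape)            # torch.Size([8, 10]): one score per digit
```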
The time in the limelight for convnets, however, was fleeting. “The machine learning community moved away from it. They did not believe this idea had any legs,” LeCun said. That retreat led to what he referred to as the “second death of neural nets,” the first having occurred in the 1960s when researchers, including MIT’s Marvin Minsky, “exposed the limitations of the approaches people had taken so far,” he said.
Creating powerful neural nets, it turns out, involves adding more layers of processing to the mix, but in the 1960s, as Minsky argued, the technology was not up to the task. In the 1980s, researchers were still constrained, this time by computing power and data. “People were only training networks of two to three layers because that’s all we could afford on the machines we had, mostly because the data sets were small,” LeCun said. “If you try to make the networks too big with little data, they don’t work that well.”
In cases like this, neural nets risk learning by rote only, memorizing their training examples without extracting anything that generalizes. One exception was LeCun’s convolutional net for handwriting, which had “five, six, seven layers,” he said. But by the late 1990s, the concept of neural nets had fallen out of favor, and with it the promise of convnets.
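Here is what that rote learning looks like in practice. The sketch below is my own illustration (again assuming PyTorch): an oversized network gets a handful of randomly labeled examples, memorizes them perfectly, and then performs no better than chance on fresh data:

```python
# A sketch of "learning by rote": an oversized network trained on a
# tiny data set memorizes random labels perfectly yet learns nothing
# that generalizes. All sizes here are invented for the demo.
import torch
import torch.nn as nn

torch.manual_seed(0)
x_train = torch.randn(20, 10)          # only 20 examples
y_train = torch.randint(0, 2, (20,))   # labels are pure noise

model = nn.Sequential(                 # far more parameters than data
    nn.Linear(10, 256), nn.ReLU(), nn.Linear(256, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for _ in range(500):
    opt.zero_grad()
    loss_fn(model(x_train), y_train).backward()
    opt.step()

train_acc = (model(x_train).argmax(1) == y_train).float().mean().item()
x_test, y_test = torch.randn(1000, 10), torch.randint(0, 2, (1000,))
test_acc = (model(x_test).argmax(1) == y_test).float().mean().item()
print(f"train accuracy: {train_acc:.2f}")  # ~1.00: memorized the noise
print(f"test accuracy:  {test_acc:.2f}")   # ~0.50: no better than chance
```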
For the record
In an interview earlier this year, Facebook’s Yann LeCun told Lee Gomes of IEEE Spectrum that, “while deep learning gets an inspiration from biology, it’s very, very far from what the brain actually does. And describing it like the brain gives a bit of the aura of magic to it, which is dangerous. It leads to hype; people claim things that are not true. AI has gone through a number of AI winters because people claimed things they couldn’t deliver.”
“People didn’t believe that by just adding layers, it would work,” he said. “That was partly due to the wrong mathematical intuition.”
In the last five years or so, adding multiple layers to neural nets (10 or more) has become the norm, dramatically transforming the technology’s performance and the industry’s perception. Today, “every speech recognition system that runs on your smartphone … ended up using deep learning,” he said. It’s also taken computer vision — the robot version of human vision — by storm.
“Within the space of a year, the entire computer vision community working on image recognition switched from whatever they were using to convolutional nets,” LeCun said. “In my 30 years of research, I’ve never seen anything like this — a technique like this taking over so quickly.”