such as NIPS, neural nets started to be heard again in the mid 2000s. I had
actually left the field for a few years, not because people were not interested
in neural nets, but because I had other interests that I wanted to pursue. I
worked on image compression between 1996 and around 2002.
Then I came back to machine learning in the early 2000s. Geoff Hinton,
Yoshua Bengio, and I started what you could call a conspiracy—the deep
learning conspiracy, basically—where we attempted to rekindle the interest
of the community in learning representations as opposed to just learning clas-
sifiers. For the first few years, it was very difficult for us to get any papers
published anywhere, be it computer vision conferences or machine learning
conferences. The work was labeled “neural nets” and basically not interest-
ing for that reason. People just didn't seem interested in digging past the title
essentially. It's only around 2007 or so that things started to take off.
For deep learning, it was still a bit of a struggle for a while, particularly in
computer vision. In computer vision, the transition to deep learning happened
just last year. In speech recognition, it happened about three years ago, when
people started to realize deep learning was working really well and it was
beating everything else, and so there came a big rush to those methods. But it
was a struggle for almost 10 years.
Gutierrez: So you've done vision and audio. What's next?
LeCun: Natural language is what's next. At Facebook, we have quite a lot of
effort going on with deep learning for natural language. That's kind of obvi-
ous though, right? Google also has pretty big efforts in that direction. After
natural language, there's video, and then after that there is the combination of
all of the above. In video, for example, you very frequently also have audio. People then
make comments on images and videos. So what you'd like to be able to do is
to represent all of those different pieces of content in the same space, so that
pieces of content that are about the same or similar topics end up in the
same region of that space. This is called “embedding.”
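The idea can be illustrated with a toy sketch: map each piece of content to a vector, then measure closeness in that shared space. Here the "embedding" is just a bag-of-words count vector compared by cosine similarity—a hypothetical stand-in for a learned deep model, used only to show that related items land nearer to each other than unrelated ones.

```python
import math
from collections import Counter

def embed(text):
    # Toy embedding: a bag-of-words count vector.
    # (A real system would use a learned neural embedding instead.)
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity: higher means the two vectors point in
    # more similar directions, i.e., the items are "closer" in the space.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

caption = embed("a cat sleeping on a sofa")
comment = embed("cute cat on the sofa")
unrelated = embed("quarterly earnings report for the bank")

# The caption and the comment about it are closer in the space
# than the caption and an unrelated piece of text.
print(cosine(caption, comment) > cosine(caption, unrelated))
```

In a real multimodal system, images, video, audio, and text would each pass through their own encoder into one shared vector space, so the same nearest-neighbor logic applies across content types.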
Gutierrez: What in your career are you most proud of so far?
LeCun: The thing I'm most proud of is that back in the early 1990s at Bell
Labs, there were a bunch of very smart people working together, and together
we built a check-reading/check-recognition system. But it wasn't just a check-
reading system; it was an entire process for doing end-to-end image recognition. The data that was available and the limitations of the computers of the time drove us to apply these ideas to check recognition. That was basically
one of the practical applications for which we had data, and there were people
willing to actually do the development, commercialize it, and everything.
People still have to catch up with the technology that we used to solve this
problem. The kind of techniques that we used there—the integrated deep
learning convolutional nets in particular, with what we now call “structure
 