Past, Present and Future of AI / Machine Learning (Google I/O ’17)

[APPLAUSE] DIANE GREENE: Hi, everybody. I’m Diane Greene. I run Google’s phenomenal Cloud. I’m on the board of Alphabet,
and I’m incredibly proud today to be here moderating this
panel of just absolute leading experts, leading
researchers, and experts in artificial intelligence
and machine learning. We’re going to
structure our panel talking about past,
present, and future, closing with some personal
reflections on the industry and our careers. And before I do
that, I’m just going to quickly introduce everybody. First is Francoise Beaufays. She’s principal
scientist at Google and is the leader of speech
recognition at Google, something everybody uses. [APPLAUSE] She’s been at Google for 12 years. Second is Fei-Fei Li. She’s the chief scientist
of Google Cloud. [APPLAUSE] Bringing AI and ML to
companies all over the world, also head of Stanford’s
AI Lab, and inventor of ImageNet and
the ImageNet Challenge, which really contributed
to some of our developments in deep learning and AI. And she’s also a
champion of STEM and AI and the founder of AI for All. Next is Fernanda Viegas. She’s a senior staff
researcher at Google, she’s a computational
designer, she focuses on the scientific
and collaborative aspects of information visualization,
co-leads the Big Picture data visualization group, she’s
part of Google Brain, and she’s also known for her
visualization-based artwork, which is part of the permanent
collection of the Museum of Modern Art in New York. [APPLAUSE] And then, let me introduce
Daphne Koller, who today is chief computing officer
at Calico, Calico Labs, which is a part of Alphabet,
working to give people longer and healthier lives. She spent 18 years at Stanford– she led the AI group
there, she cofounded Coursera, which is the
largest platform for MOOCs– massive open online courses. And you know, Daphne was
one of “Time Magazine’s” 100 most influential
people in 2012, she won a MacArthur
Foundation award, she won the inaugural
ACM Prize in Computing, she’s a member of the Academy of
Arts and Sciences, the Academy of Engineering–
and those are just some of the proof points
of her excellence. [APPLAUSE] So we’ll start with a
historical perspective. And Daphne, as one of the most
prominent and prolific authors of machine learning
research papers, can you give us your
perspective on how we’ve transitioned to deep learning? DAPHNE KOLLER: So I think
the deep learning revolution is a truly exciting enabler
that we’re seeing today in so many aspects of so
many real world problems. But that revolution
grew out of a lot of machine learning research that led up to it. So prior to deep learning,
there were probably about 10 or 15 years of very
hard work in developing models that were maybe
more hand-crafted that required a lot more thought
and a lot more prior knowledge. And we really had to think
through the specifics of the model and how it
relates to the domain. Because when you don’t
have a lot of data, you have to replace that
with a lot of human intuition on how the model ought
to be constructed. As we’ve gotten more and
more data in certain domains, text and images being I think
the two prime examples– and speech of course, as well– we’ve started to replace a lot
of that need for human insight with more and more data
that counterbalances that. But the techniques that were
developed in those 10 or 15 years are still pivotal
today, both in the methods themselves– those optimization
algorithms that were developed over the last 10 or
15 years are still a key component of what enables
deep learning to be successful. And I think that
while we might like to think that big
data is at this point the solution to everything, it’s
a solution in certain domain areas but, in others, we
still are unfortunately in the medium or sometimes
even small data regime. And so there’s still
definitely a need for balancing human intuition
about the domain with the data that we’re acquiring and
coming up with a model that incorporates the best of both. DIANE GREENE: Thank you. And Fei-Fei, can we also
get your perspective? You’re running the
Stanford AI Lab, your ImageNet work was seminal,
and now you’re bringing– you’re looking at bringing AI
to every company in the world. What is going on
with that transition? FEI-FEI LI: Right. Thank you, Diane. So just a little bit of
a historical perspective: AI, among all the sciences
of human civilization is actually a very young field. We’re about 60 years old,
but the very question of the quest for
intelligence, in my opinion, is what’s at the root
of AI’s inspiration. And that dates back
to the dawn of civilization. So about 60 years ago,
when machines started to compute and
calculate– at that time, very simple arithmetic– already
thinkers like Alan Turing challenged humanity with the
question, can machines think? Can machines have intelligence? So about 60 years ago,
leading computer scientists like Marvin Minsky, John
McCarthy, and many others got together and really
jump-started the field that today we know as AI. The AI that the
founding fathers saw was very different
technically 60 years ago. But it was the same core
dreams which is making machines to learn, to reason, to
think, to perceive, to speak, to communicate. And AI has gone through several
waves of technical development, from first order logic
to expert systems to the early waves of
machine learning to today– the deep-learning revolution. I would call the past 60 years in-vitro AI, or AI in-vitro. Those are the 60 years where, as a field, we laid the foundation
of the questions we ask, the sub-fields that are
essential to AI’s quest– like robotics, computer vision,
natural language processing, speech processing, computational biology, and so on. But also the way we measure the
progress, to understand our data, and to discover the tool sets. So around 2010, thanks to the convergence of maturing statistical machine learning tools, of big data brought to us by the internet and by sensors, and of the computing power that Moore’s law carried us to– much better hardware– these three pillars
came together and lifted AI from
the in-vitro stage into what I call
the in-vivo stage. AI in-vivo is where AI is making
a real impact on the world. It’s just the beginning. Every single industry we see at Google Cloud is going through a
transformation because of data, because of AI, and
machine learning. And this is what I see
as the historical moment that AI is going to impact
and transform the field. But I also do want to say
it’s just the beginning. The tools and the
technologies we have developed in
the field of AI are really the first few drops
of water in a vast ocean of what AI can do. We cannot over-promise, but there should be tremendous excitement about what we can do– there’s a lot more work to do to make this AI in-vivo happen. DIANE GREENE: I share
your excitement. This in-vivo stage–
I mean companies are putting virtual renditions
of themselves in the cloud and then they’re using AI to
do things nobody ever thought was possible. And AI’s being used everywhere,
not just in the cloud. So thank you. If we dive a little
deeper, Francoise, you’ve been at the frontier
of speech recognition and now in applications,
speech recognition is actually becoming
almost commonplace. Can you take us through
that transition? FRANCOISE BEAUFAYS:
Yes, certainly. So I joined Google
about 12 years ago. And a bunch of us
came with this vision of doing something useful and
fun with speech recognition. Speech recognition had been
around for quite a while. All of us had some
background in the field, but we wanted to
do something fun. And that was hard because speech
wasn’t the quality it is now. So we started with
fairly limited products where the task of recognizing
what the person says wasn’t too, too complicated. We were just trying to push
the envelope a little bit but not much. Because we needed to
bring it to a place where the product was successful
enough that people would want to use our application,
and then we could start folding
data into the models and keep it training from there. So we built what
we called GOOG-411. I don’t know if any
of you remembers that, but it was just a
phone application. You would call a number,
and then it would say, hey, what city and state? And you would say what
you were interested in, and then it would ask
about the business that you were interested in. You would say that
name, and then it would offer to connect you
to that business in that city and state. So again, picture that
it’s 12 years ago. There’s no iPhone,
no Android phone, all you have is that little flip
phone that you put to your ear. So it was very basic. Fortunately,
leadership at Google was really visionary
about this technology and really encouraged us
to push the boundaries as much as we could. And so we were successful
with this first application. But then, the iPhone
and the Android phone came, so everything
changed, obviously. Now there was visual feedback. So we started thinking
about other applications, and that was voice search– so Google search by voice. And then we started doing
dictation and having a little microphone in
every possible entry point in your phone, so
you could do everything with your voice. And more recently,
we’ve moved into trying to enable speech
recognition within your home with devices like Google Home. And because people are asking
for more and more tasks to be fulfilled through voice,
that was a really good entry point into the whole
Assistant story, where instead of just enabling you to do very
small things with your voice, now you can ask questions,
you can phrase them in natural language,
you can really get Google to be your assistant
without this cumbersome physical keyboard input. DIANE GREENE: Thank you. Fernanda, you said you
wanted to democratize data visualization, and that’s sort
of inextricably linked to data. How did you get to that? What are the needs you see for
data visualization analytics? And how’s that evolved? FERNANDA VIEGAS:
Yeah, so I started working in data visualization
over 10 years ago. And when I did, it
was much different. It was much harder to
do data visualization, the machines were
not nearly as good, and there wasn’t as
much data out there that was publicly available. That started to
change, and now we find ourselves in
an environment where people are interacting
with data visualization sort of everywhere. It’s really exciting. It’s been amazing to
see, like, journalism take on data
visualization and talk about really
complex stories when they talk about statistics. We always joke that
data visualization is sort of that gateway
drug to statistics. It’s like you’re
doing statistics without even noticing that
you are, because we’re just visually so good at picking
up patterns and outliers and so forth. So data visualization
has been on this trend of becoming more democratized. And also, I really
believe that people have– we have been increasing
people’s ability to take in data and numeracy. And so data visualization has
had a role to play in that. In terms of AI, it’s
been really interesting. Because we saw a major jump
when Geoff Hinton and colleagues proposed the first sort of
blockbuster visualization for AI, t-SNE. It’s a technique
that allows you– so one of the big challenges
with AI and with machine learning is that these are
systems that work in very high dimensional spaces. And those are really hard for
us as humans to understand. So visualization is one way
that you can sort of peek and try to understand what’s
happening in these systems. And these techniques, such
as the one that Geoff Hinton developed, allows us– they allow us to
sort of understand how things are
clustering together, what are the relationships
between different concepts, and how the systems are sort
of resolving the data that they are ingesting. So I’d say that was major progress. And, as Fei-Fei was saying,
I feel like we’re also in the beginning of this
relationship between how visualization can help AI.
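To make that concrete, here is a minimal, illustrative sketch of the kind of projection being described: take high-dimensional vectors that a model produces and flatten them to two dimensions with t-SNE so that clusters become visible to the eye. The 128-dimensional embeddings and the three concept groups below are made up for illustration, and the scikit-learn and matplotlib usage is an assumption of this note rather than the panel's actual tooling.

```python
# A minimal sketch (not Google's internal tooling) of embedding visualization:
# project hypothetical high-dimensional model embeddings to 2-D with t-SNE
# and look for clusters.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Hypothetical 128-dimensional embeddings for three made-up concept groups.
groups = [rng.normal(loc=center, scale=0.5, size=(100, 128)) for center in (0.0, 2.0, 4.0)]
embeddings = np.vstack(groups)
labels = np.repeat([0, 1, 2], 100)

# t-SNE preserves local neighborhoods, so points that are close in the
# 128-D space tend to land in the same 2-D cluster.
projection = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(embeddings)

plt.scatter(projection[:, 0], projection[:, 1], c=labels, s=8)
plt.title("t-SNE projection of hypothetical model embeddings")
plt.show()
```

In practice the input would be real activations or embeddings from a trained model rather than synthetic vectors, but the workflow (embed, project, inspect the clusters) is the same.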
DIANE GREENE: Thank you. Now we’re going to switch to a slightly more technical set of answers to what’s
going on in the present. Francoise, maybe
we’ll start with you and talking about speech
recognition technologies, and what have the
transitions been, and what were the challenges,
and what are they today. How have the challenges evolved? Yeah. FRANCOISE BEAUFAYS: Yeah, sure. So speech recognition is
really complex, right? It’s difficult to recognize
what you’re saying. Each one of us has a different
voice, has a different accent. We’re speaking in
different environments. So all that contributes to
the richness of the voice. And I think mostly
for that reason, speech recognition has always
been based on machine learning. There hasn’t been much of an earlier phase that wasn’t machine-learning based. It’s just that type
of machine learning has been evolving over time. And we kept making progress in
the field for the last three decades, but I
think one inflection point has been the adoption
of neural networks. And that has happened maybe
eight years ago or so, maybe a little bit less. But the early research
in speech recognition using neural networks
happened a long time before. There was a lot of
activity in the field. There were a lot of
promising results; however, there wasn’t
the complete support to really make it happen. And so neural nets were a
little bit abandoned for awhile, and speech recognition
kept improving with more basic methods
like Gaussian Mixture Models and whatnot. And then, when we started really
evolving into deep neural nets, it was a big effort from
an engineering viewpoint. We had to deal with
latency issues, with size, with training
capabilities, and so on. And eventually, when
deep neural networks became a reality, when we
launched them when we really had them in production,
that opened the path to a whole bunch of
other improvements. Because now, we
had the capability of having that complex machinery
there behind the technology. And so we could
move very quickly from one neural
architecture to the next. And so we started looking into
recurrent neural networks such as LSTMs, we looked into
convolutional neural networks and CTC-based
sequence modeling– we have a whole bunch
of new sequence-modeling implementations coming up. With Google Home, we have
neural beamforming. And so essentially,
what’s happening is that moving
to the neural network space has opened this
incredible capability for innovating the
core technology that powers our systems and
keep optimizing and giving all of you, in whichever
language you’re speaking, the best possible
accuracy we can.
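As a rough illustration of the recipe just described (an LSTM over acoustic frames trained with CTC loss, so that no frame-level alignment between audio and transcript is needed), here is a small, hypothetical sketch. It assumes PyTorch and made-up feature and vocabulary sizes; it is not Google's production recognizer.

```python
# A minimal, illustrative LSTM + CTC acoustic model (assuming PyTorch).
import torch
import torch.nn as nn

class TinyAcousticModel(nn.Module):
    def __init__(self, n_mels=80, hidden=256, vocab_size=29):  # 28 output symbols + 1 CTC blank
        super().__init__()
        self.lstm = nn.LSTM(n_mels, hidden, num_layers=2, bidirectional=True, batch_first=True)
        self.proj = nn.Linear(2 * hidden, vocab_size)

    def forward(self, feats):                  # feats: (batch, time, n_mels)
        out, _ = self.lstm(feats)
        return self.proj(out).log_softmax(-1)  # per-frame label log-probabilities

model = TinyAcousticModel()
ctc = nn.CTCLoss(blank=0)

# Fake batch: 4 utterances of 200 frames of 80-dim log-mel features,
# each paired with a 20-token transcript (labels 1..28; 0 is the blank).
feats = torch.randn(4, 200, 80)
targets = torch.randint(1, 29, (4, 20))
log_probs = model(feats).transpose(0, 1)       # CTCLoss expects (time, batch, vocab)
loss = ctc(log_probs, targets,
           torch.full((4,), 200, dtype=torch.long),   # input lengths
           torch.full((4,), 20, dtype=torch.long))    # target lengths
loss.backward()  # gradients flow end to end; a real system would now step an optimizer
```

The same training loop carries over when the LSTM is swapped for convolutional or other sequence models, which is part of why iterating on architectures became so much faster.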
DIANE GREENE: OK, well– from neural nets for speech recognition to neural nets for extending our
lives and making us healthier. A fairly open-ended
question for you, Daphne– why does Calico need one of the
top researchers in the world in molecular biology– you know, biological computing,
and also machine learning? As the chief computing officer,
what are you doing over there? DAPHNE KOLLER: So
many of you may not know of Calico
because we’ve been a little bit under the radar. So Calico is one of the Alphabet
companies, the first one that was spun out of Google. And it aims to understand
the problem of aging and to help people live
longer and healthier lives. Now when you look at
aging, you realize that it’s actually the single
largest risk factor for death. And I know that seems kind of
funny when you think about it, but it’s true for almost
every disease that happens after the age of 40. That as you grow
older, year after year, the risk from that disease
increases exponentially every year, whether
it be diabetes or cardiovascular
disease, or cancer. All of these increase
exponentially. No one knows why that is. Why is it that every year
of life after the age of 40 puts us at an increased risk
for each of those diseases? And in order to
understand that, we really need to study the
biological systems that exhibit aging at
the molecular level all the way through
the systems level and figure out what it is
that’s causing us to age. Because I don’t think
we’ll live forever, but maybe we can live longer
and healthier by interventions. One of our earliest
scientists, Cynthia Kenyon, who came over to
Calico from UCSF, showed that with a
single gene mutation in the nematode C. elegans,
you can extend its lifespan by something like 30% to 50%. And not only does
the worm live longer, it lives as if it were a
healthy, young worm in terms of reproductive health,
and movement, and so on. So can we do
something like that? That would allow humans
to live healthier? So that would be really cool,
but in order to do that, there’s a whole lot of
understanding that we still need to gain. And in order to do that,
we need to gather data about all of those
systems, all of which age. Yeast age, worms age,
flies, mice, humans– what is it that we all have in
common at the molecular level? So fortunately, scientists have
been able, over the last 20 years, to devise a whole slew
of measurement modalities that allow us to get an
understanding, or at least data, regarding
systems as they age. And that includes
techniques like sequencing and microfluidics at the
low level, and imaging, all the way through
to things like devices that track movement
and wearables that let us see how
systems change as they age. But no human being
has the capability to put together data at these
different modalities that range all the way
from subcellular to entire human populations. All these different
modalities that include DNA, and RNA, and mass
spec, and imaging, and so on. All of the time scales that are
involved from the subcellular scales all the way to the scales
of an entire human lifespan. How do you put
all these together into a coherent picture
of what makes us age and what interventions
are the most likely to be successful in slowing that aging
process and making it better? So that ability to interpret
the data and make use of it really requires a
true partnership between the scientists who
are collecting and getting intuitions about these
processes and the machine learning people who
can help construct models that can synthesize and
put the whole thing together. And neither of these communities
can be successful on its own. I was one of the fortunate
people who entered this field in its very early stages. So I’ve been working in the
field of computational biology since on the early 2000s. And as such, whereas
you could say that I am native in the
language of machine learning, you could say that
I have fluency in the biological language. And as such, it
allows me to work with the scientists
of Calico to create a true partnership between
those two disciplines and build models that,
as I mentioned earlier, combine the
best of both worlds– the best of big data, but also the
best of human intuition. Because biology’s so complex
that I don’t think that even with the amounts of data
that we’re collecting today, we’ll be able to reconstruct
biology de novo from data alone. You need the data, but you
also need the intuition of some of the world’s best scientists. And so by working together
at a place like Calico, we can get some
of those insights as well as some of the
enormous amounts of data that are currently being collected. And I’ll come back to that
later to really construct an in-depth understanding
of the biology of aging and at the same
time try and predict which interventions
might be helpful. DIANE GREENE: Thank you, Daphne. I feel we should pause. A lot of profound thinking. [APPLAUSE] Hang on to your hats! We’re going to jump
back to vision. And Fei-Fei, just
the other day, you were quoted in
TechCrunch as saying, “Vision is the
killer app of AI.” And so what do you mean by that? And what does it mean to
democratize AI, and what does that have to do with the Cloud? FEI-FEI LI: Yeah, so
yes, I was actually trying to be provocative,
and I stand by it. The quote is that, while
many people are asking for the killer app
of computer vision, I will say vision itself
is the killer app of AI. So let me qualify
that by two reasons. The first reason
comes from nature. 540 million years ago,
a remarkable event happened in animal evolution. For some odd reason, the
number of animal species went from very
few simple species to an explosive
increase of the variety in the types of animals. It was considered the big bang
of evolution or the Cambrian explosion. And the zoologists were
puzzled for many, many decades about why this happened. And recently, a very
convincing and prominent theory conjectured it was
the onset of eyes. Animal vision. When eyes were first
developed in animals, suddenly animal life
became proactive. There’s predators and preys,
and the whole evolution just changed. 540 million years later,
humans are the most intelligent visual animals. In fact, nature devoted
half of our brain for visual processing
because of its importance. So that’s one thread
of evidence. Another piece of evidence comes
from technology and the world we live in. If you look at
our internet today where data is awash,
while YouTube alone sees 300 plus hours
of video uploaded every single minute, and
it’s estimated more than 80% of the entire cyberspace is
in some kind of pixel form. And look at the sensors. The biggest data form
that sensors capture is some kind of image, whether in the visible spectrum or a spectrum outside of visible light. From biology labs to hospitals,
from self driving cars to surveillance cameras– everywhere, the pixel format– data in pixel format is
the most invaluable data for consumers and companies. At Cloud, I had the chance to
talk to a lot of customers. I hear all about the
demand for image recognition, video processing,
video analytics. So it’s really an exciting
time for computer vision. Again, it’s just similar
to speech recognition. Thanks to the progress of
deep neural nets, vision has really taken off as a field
that’s made a lot of progress. In the past 10
years, between 2010– or seven years– to 2017, I
would say that the biggest problems in computer vision have been
the basic perception tasks– object recognition, image
tagging, object detection, we already have, you see,
products coming out of it– Google Photos, pedestrian
detection in self-driving cars, and all this. But the next
phase of investment in computer vision
technology, in my opinion, is really vision
plus X. Vision is so fundamental in
communication and language. How do we speak stories? How do we tag and index videos? So the connection and interplay
between vision and language is going to be
extremely interesting. Then vision and biological
sciences– the throughput of data coming
from biology, and health care, and medicine in vision
form is phenomenal, be it radiology
or laboratories. And I think there is a huge
opportunity for vision to play. And one last example I also
want to give is robotics. Speaking as a
researcher, there is a lot of excitement
now happening in the area of
vision and robotics. We’ve been doing robotics for
as long as AI ever existed, but robots are still
not where we want them to be. To a large extent, it’s because
of their primitive perception systems. And I think vision
can play a huge role. So basically, I
do think vision is one of the most important
elements of machine intelligence and also for the
transformation of enterprise and companies. DIANE GREENE: Thank you. Great perspective. [APPLAUSE] I need to be careful, because
we’re running out of time. So Fernanda, how
does vision help visualization and visualization
help machine learning? And maybe to save time,
you could from that go into where you see
the future of where you can take the visualization. FERNANDA VIEGAS: Sure. So yeah, so to piggyback
on Fei-Fei’s answer here, we have this amazingly
sophisticated vision system. We might as well use it to
understand what these machines are doing, right? So machine learning runs on tons
of data, tons of statistics, and probability. Well, it turns out that data
visualization can kind of be a secret weapon in trying
to understand what’s happening. And why do we care? Why should we care? We should care because of a
bunch of different reasons. One is interpretability. Can you interpret what’s
coming out of your models? Second is debugability. Better understanding what’s
happening with your models will allow you to
then debug them. And then finally,
there’s also education. Visualization is already
playing an important role when it comes to education
about machine learning. And I also have a
final education piece there that I’m
very excited about, which is, when we start
to understand better, when we use visualization
to understand better what the
systems are doing, then can we learn from them? Can we become better
professionals, better domain experts in whatever. If I’m a doctor, if I’m an
architect, whatever it is, how can I learn from these
very specific systems and then be better
as a professional? Another thing about
visualization that I think is really powerful
and really important to keep track of is the fact
that by using visualization, we’re always keeping the
human in the loop, right? And that is huge. And as we build
autonomous systems, we want to make sure that
they are behaving well. And so visualization
can be helpful there. I want to tell you a very
quick anecdote about a moment– a scientific moment when
visualization showed us something we didn’t know
before about a machine learning system. So last year, Google deployed
its multilingual translate system, and it was great. It was this really
exciting moment of just putting a ton
of different languages in one system and having the
system somehow figure out how to translate from every
pair of languages. The extra bonus was
that it was able to do what is called zero shot
translation, where it was able to translate from
pairs of languages it had not necessarily
seen before. So one of the
fundamental research questions that the experts
doing those systems had was, how is the system resolving
this space of multilingual data? Is the system creating
something that looks kind of like a model
over here for English, and a model over
here for Spanish, and another one for Portuguese? Or is the system doing
something very different? Where it kind of
mixes everything up in the same spaces,
and it’s maybe learning something
about the semantics and the meaning of words
and not necessarily what language it comes
from or what language I’m translating to? So what we did is, we
built a visualization to look into this. And the really
exciting point was when we started seeing that–
we visualized sentences that were being translated into
a bunch of different pairs of languages. And the really
exciting thing was when we saw clusters
of sentences in these different
languages show up together. So if I have a sentence that
I’m translating from Portuguese to Spanish to English
and vice versa, all of those representations
showed up clustered together. And then another sentence
here with all the clusters of all the languages and then– so in other words,
what did we find out? We found out that the system
was not partitioning the space into different languages. The system was coming up
with a unique representation of those multiple languages. So in other words, we
saw the first signs of a universal
language of something that we call interlingua. That was amazing. And so it was
almost as if we had run this multilingual
system through an MRI, and we’re like, whoa! These are the results! The other thing the
visualization allowed us to do was to then look
at neighborhoods that didn’t look
very well resolved– where a little language was
kind of hanging out by itself. Those were translations
that were not good. They were not high quality. So what that tells us is that
the geometry of these spaces is meaningful. And if you have your
neighborhood sort of hanging out in the
periphery by itself, you might want to look at that. You might want to make sure
that you debug your system. So these are kind
of superpowers you can have in understanding
and making things better, making things work better. And for the future,
One of the things I’m really excited about
sort of goes hand in hand with something that Fei-Fei I
think is a true advocate for, is democratizing AI. Visualization, I think–
and other techniques– it’s not only
visualization, but I truly believe that the more
different kinds of people we bring into the fold of
ML, the better off we’re going to be. Right now, AI still feels
very engineering centric, and I’m really curious what
will happen when we bring in designers, UXers,
scientists that we’re starting to bring in. What are the different
possibilities? What are the different solutions
that we haven’t even thought about that we can
then start exploring? DIANE GREENE: Thank you. [APPLAUSE] Francoise, I feel
like I should ask you how’s data visualization going
to help speech recognition, but I also wanted to ask
you about as data gets more complex, you know, we’ve
had all this labeled data for the training models and
we do more personalization, where’s the technology
going and what challenges are you excited about? FRANCOISE BEAUFAYS: Yeah, it’s
actually very interesting. Each time we jump into a new
problem in speech recognition, we really have to
focus on it in a sense, you know, when we start
working on YouTube Kids, for example, which was a
YouTube space for children, we really had to focus
on those young voices. They don’t speak
the same way we do, they don’t have the
same pitch range, they don’t have the same
way of chopping words. They take these deep
breaths, and then they have a burst of speech. So we really had to focus on it. And then eventually,
we found a way of folding back that learning
into our generic models so that Google
Home, for example, would work with your children
as well as it does with you. But Google Home itself was
also a new environment where we had to collect new data. And when that data
is available to you, then it’s easy to fold it into
the models and keep retraining. But the first time you want to
launch a Google Home device, you don’t have it, right? And so we did a
lot of simulations taking data, adding
noise of different types, doing different types of
reverberation on the data, and indeed, we use
massive amounts of data. We transcribe tens of
thousands of hours of speech and then we multiply it
with the simulations. If you do the math, it
averages to something like a handful of
centuries of speech that we can fold into a model,
so just massive amounts of data.
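As an illustration of the simulation idea (with made-up numbers and a deliberately crude room model, not Google's actual pipeline), the sketch below adds background noise at a chosen signal-to-noise ratio plus a simple synthetic reverberation to a clean waveform, and works out how multiplying transcribed hours by simulated conditions quickly reaches centuries of training audio.

```python
# A rough sketch of speech data augmentation: noise plus a crude synthetic reverb.
import numpy as np

rng = np.random.default_rng(2)

def simulate_room(clean, snr_db=15.0, rt_decay=0.3, sr=16000):
    """Add noise at a target SNR and convolve with an exponential-decay
    'impulse response' as a stand-in for a real room measurement."""
    noise = rng.normal(size=clean.shape)
    noise *= np.sqrt(np.mean(clean**2) / (np.mean(noise**2) * 10 ** (snr_db / 10)))
    ir = np.exp(-np.arange(int(rt_decay * sr)) / (rt_decay * sr / 5.0))
    return np.convolve(clean + noise, ir, mode="full")[: len(clean)]

clean = rng.normal(size=16000)  # stand-in for 1 second of 16 kHz speech
augmented = [simulate_room(clean, snr_db=snr, rt_decay=rt)
             for snr in (5, 10, 15, 20) for rt in (0.1, 0.3, 0.5)]

# Back-of-the-envelope math (numbers hypothetical): 50,000 transcribed hours
# times, say, 50 simulated noise/room conditions gives 2,500,000 hours of
# training audio, i.e. roughly 285 years, a few centuries' worth.
print(50_000 * 50 / (24 * 365))  # ~285 years
```

The exact hours and condition counts are invented for the example; the point is only that simulation multiplies whatever transcribed data exists by the number of acoustic conditions you synthesize.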
And I think it’s very interesting to think about how that
scales to more problems with different acoustic
characteristics, but also to different languages. If I can ask you guys, like
you know how many of you speak another
language other than English? Right, you see? All the hands are going up. So we really want to make our
technology available to all of you in your own language. And if you think of it,
it’s a massive problem. How are we going to do that? Are we going to build
one recognizer that works for all of you? Are we going to do like we
do now, one per language? Well, how about dialects, then? And how are we
going to do when we have a language that’s a fairly
small one with a small pocket of individuals? So if you ask linguists,
they will tell you that there are 6,000, 7,000
languages in the world. They will tell you that there
are about 1,000 of them– actually, 1,342, they say,
that have more than 100,000 speakers. So that’s a lot, right? And if we want to really go
into deep internationalization and serve all the languages that
have big populations on earth, it’s going to require
a lot of creativity on the machine learning
side to manage to share data among languages, learn
from other languages, and so on. So I think
it’s really exciting and there is still a ton of
work to do in that domain. DIANE GREENE: I agree. It’s very exciting. Thank you. [APPLAUSE] We’re going to go a
little over, so we have time to hear about the
future from Fei-Fei and Daphne. Fei-Fei, what excites you about
what’s possible going forward? FEI-FEI LI: What
excites me about what’s possible going forward? Let me just say one dimension. I generally believe AI
is one of the driving forces of the fourth
industrial revolution. It’s just the beginning,
but it has the potential to transform the way humans
live, work, and communicate. And one favorite line I
heard from a philosopher is, there’s no independent
machine values. Machine values are human values. So one thing that
really excites me is to include
diverse technologists in the field of AI to
build the future together. Because once we have that
diversity of representation in the field of
AI technology, we will build the technology that
is for the entire humanity, not just a slice of it. DIANE GREENE: Yes, you
have a lot of credibility when you say that, Fei-Fei. [APPLAUSE] And Daphne, the
intersection of biology, and computing, and everything
else you’ve done, what– DAPHNE KOLLER: Well, when
I look at the progress that machine learning has made
over the last five to 10 years, and as a long time AI
researcher, if you’d asked me even five years
ago, will computers be able to caption
images without any kind of prior knowledge,
just in the same quality that a human would, I would have
said, nehh, maybe in 20 years. And with the work of
Fei-Fei and others, we’ve been able to reach
that milestone way sooner than I would have expected. The reason I moved back
to biology from Coursera is because I think
we’re hitting that knee in the curve in biology. So when you look,
for instance, at some of the current
predictions, there was a paper that was published
in 2015 called “Big Data– Astronomical or Genomical?” And it looks at the number of
human genomes sequenced, which is a very limited part
of biological data that’s being captured, and you look
at the historical trend, and that amount
doubles every seven months, which makes it about
twice as fast as Moore’s law. So if you look at 2025
and you project that line, the number of human
genomes sequenced by 2025 will be, on the conservative
projection, 100 million, and if you look at
the historical trend, it’ll be two billion. Two billion human
genomes sequenced, and that’s just sequence. That doesn’t count RNA, and
proteomes, and whole body imaging, and cellular imaging. So we’re at the cusp
of the beginnings, I think, of really understanding
what is the most complex system that we’ve encountered, which
is that of a biological system. What is it that makes us alive? What is it that
forces us to die? And so I think with that amount
of data and the techniques that machine learning has
developed and will continue to develop, we
have an opportunity to really transform
science in this way. And I’m really excited to
be able to bring those two communities together
to make that possible. [APPLAUSE] DIANE GREENE: So there’s
clearly so much more that we could sit
and listen to here. This has just been
a phenomenally interesting and inspiring panel. Thank you very much. FEI-FEI LI: Thank you, Diane. [APPLAUSE]

64 thoughts on “Past, Present and Future of AI / Machine Learning (Google I/O ’17)”

  1. That lady in the black and white is super super smart. I mean they all are, but she seems to have a big vision of the past impacts to the future impacts of AI.

  2. We so need people like the woman from calicolabs; it's the ability to tackle a problem from such a critical viewpoint where she actually said, and I'm paraphrasing, age seems to be the number one factor in your demise– there are diseases / many other risks, including death from whatever, as we get older. Later, she moved to how she wanted to tackle making your life better as you age… 😉

  3. Hopefully, this whole A.I. stuff will in some extents satisfy the expectation of some investors. Otherwise, we will have another A.I. winter just like what has happened in the 1970s. Machine learning is nothing new, we just have better computational power and more data right now.

  4. The talk is about a bunch of machine learning and deep learning stuff. But i didn't get it, why the future of A.I. ? Where are the A.I. topics in this talk?

  5. 12:20 – Yeah right voice search… Didn't came out year after Apple released Siri (made by company that they bought tho).

  6. AI in Google is dominated by ladies? Good news. Seems that our world will be less violent when we are dominated by AI.

  7. Having been in the biz 40+ years, I welcome experiences that disrupt my gender biases, such as this video. Thanks.

  8. To any one stating a lack of "facts" in this Talk. The underlying knowledge through the board here is huge!
    Especially the short speech from fernanda from 31:00 on is eyeopening, literally.
    keep it up!

  9. Google I/O: "We'll live longer & better." -> little applause; "We're supporting Kotlin programming language for Android dev." -> huge applause.

  10. Collecting extremely large amounts of data and finding consistencies and inconsistencies within completely unrelated systems (macro and micro) is the greatest contribution that A.I. can make towards understanding the universe, improving technology as a result.

  11. the speech reco expert clearly lives in a world different from mine. I mean indeed speech is very important in daily lives, but really not very much of that is using artificial intelligence (most speech is still processed by humans).

  12. Most of this stuff has been around for decades, but only recently has anyone figured out how to put them all together. Now I can command my phone to "show me pictures of cats", and I see pictures of cats. Perhaps we have reached the summit of human achievement and any future accomplishment will be a disappointment.

  13. I know I'm not smart but what's the difference between massive if/ else statements and AI. It's a bigger decision tree in simple terms right?

  14. Data from all artificial brain machines will be disseminated to other artificial brain machines to be synchronized and then reassembled for processing by the computer, then distributed again to all artificial brain machines and so on …… this way is expected to extend the life of humans . Why not just developed an artificial immune system or so-called human body's defense system.

  15. I think it was smart to only build an AI that would never have an access to the Internet. One would only manually upload information to it, give it a task and wait for it to create various solutions from which people could pick up the most appropriate ones for the fulfilling of the required task. The AI should also have no ability to create or do anything besides creating ideas. it should also not have an understanding of itself as an individual. In other words, it should only be an isolated machine that can be used in solving scientific problems.

  16. All panelists female even tho men enroll in STEM much more than women. I'm sure that happened through fair competition and pro-female bias had nothing to do with it.

  17. I am so excited about how we've developed X in the past, it has a huge role to play in developing Y, it is a very interesting time with so many challenges overcome yet so many new ones blah blah blah blah.. replace X and Y with AI, robotics, machine learning, speech recognition etc. I expected more from top google scientists. I must say that this could be better done by a simple bot just repeating the same statements in different configurations without many facts or information. The topic deserves much better coverage than served by this panel.

  18. where are the white males? Driven from google by the diversity initiative? or down in the engine rooms of google writing the actual code?

  19. interested in AI and MACHINE LEARNING visit my yech blog to get a simple and lucid intro adstark1.blogspot.com and must put ur views too

  20. 25:55 It's so mind blowing to me how much of research is motivated by nature. It seems really obvious, but it's not like I look at my phone throughout the day and think "gee, I'm able to unlock my phone with my face because of insights scientists gleaned from nature"

  21. DATA is already in sequence we need a purpose to pick up right data at right time. The purpose is to make human life better, faster and comfortable.

  22. Yes I love this! Power to the women! This is what little girls need to look up to, not Kim Kardashian and Barbie dolls! Do something meaningful in your life and stop looking in the mirror every 5 minutes! This is what women should look like cause we are human beings not objects for men to play with! So use your brain girls and have some self respect! Listen to these women they are where we need to be.

  23. These are excellent AI researchers. But Google's Deepmind leaders are at a higher level than anybody in the field of AI, IMHO.

  24. At 15:38-40 the subtitles read: "So visualization is one way that you can sort of peak and try to understand what's happening in these systems". The word should be "peek". The speaker actually reinforces what she says by her body language (her hands and arms looking over the edge of something). "peak" makes no sense in the context and you may be able to guess what she means simply by listening to the audio, maybe in combination with her body language, and by ignoring the incorrect subtitles.

    Is an AI system doing the subtitles?

    Yeah, humans too can produce utter garbage in creating subtitles (we could open a channel for some real doozies!) but at least any reasonably educated person can pick up the mistake and make them read sensibly. Does it matter? Yes, it does to people who are deaf and to people who do not know the language being subtitled. English has a lot of homonyms and their misuse gives rise to jokes, and you can still enjoy the opera/film in another language even if mistakes are made, but correct "translation" is incredibly important in international councils like the UN.

    "That is why it was called Babel—because there the Lord confused the language of the whole world. From there the Lord scattered them over the face of the whole earth."

    AI has a long, long way to go to making a "good hash" of speech recognition. What a lot of levels of meaning and punning can be made out of those two little words 🙂

  25. The moderator is in way over her head, isn't she? She seems really nervous and as tho she just barely understands the language the panelists are speaking.

  26. I really didn't think it possible, a forty-four minute "lecture" with a content so vacuous, so utterly devoid of useful details. It seems the content is geared towards a pre-teenage audience, a primer if you will, AI 101.
    It is astounding how this group can speak for so long & say so little!

  27. Listening to these ladies felt like drinking water from a fire hose. Their knowledge inspired me to work even harder.
