For over 20 years, Microsoft Research’s labs around the world have
focused on research across a broad spectrum of topics in computer
science. From the start, the organization has invested heavily in
pioneering breakthroughs in machine intelligence, including efforts in
machine learning and big data. In this interview, Distinguished
Scientist Eric Horvitz talks about advances he sees on the horizon, the
influence they will have on your daily life, and how insights from big
data and developing more intelligent software and services will change
the world.
REDMOND, Wash. – Feb. 15, 2013 – At Microsoft Research labs around the world, some very deep thinkers are contemplating big data.
This includes Eric Horvitz, distinguished scientist at Microsoft
and co-director of Microsoft Research’s Redmond lab, who was recently
elected to the National Academy of Engineering for his work in
“computational mechanisms for decision making under uncertainty and with
bounded resources.”
He sees a future where machines, fueled by large amounts of data,
can become “empowering, lifelong digital companions” who know what you
want or need (be it pizza or medicine), where you want to go (be it
Hawaii or the most traffic-free route to the ball game) and generally
work with a passion on your behalf.
Capturing data, storing it, interpreting it, and leveraging it can
provide insights on small and large scales, and in high-tech and
mainstream fields alike, Horvitz said.
“In today’s world, effective large-scale data analytics for
predictive modeling, visualization, and discovery are becoming central
for success in many areas.”
Microsoft News Center recently spoke to Horvitz about how
Microsoft Research (MSR) is investing time and talent in the area of big
data and machine intelligence, what breakthroughs MSR has made, and his
vision for the future of these fields.
MNC: Why do you think there is such a buzz around big data right now?
Horvitz: Buzzwords arise
for a variety of reasons. In this case, I believe a confluence of several
factors led to the popular use of that catchy phrase. One is the data
that’s being collected in unprecedented quantities now on a variety of
fronts, and advances in computer science – in sensing, storage and
networking. Large amounts of data are being collected in part because of
the shift of many human activities to the Web – and that has made it
easy to collect transactions and events of various kinds in stream with
activities. This includes everything from e-commerce to cars driving
over sensors in roads to smartphone services leveraging location data,
to healthcare. In healthcare, the explosion of genomics and the
increasing capture of clinical data in hospitals has brought gigabytes
and terabytes of patient data into databases – and we are in the early
days of biomedical informatics. Storage also has become very
inexpensive compared to what it used to be. We used to talk about maybe
one day having terabytes of data. Now terabytes are something your kids
can carry on a small drive in their pocket as they go to middle school.
On the computational side, there have been advances with computational
procedures we use to harness data for multiple interesting uses – such
as building predictive models from data. As examples, we can leverage
data to make real-time predictions about a computer user’s changing
intentions or interests and learn to recognize someone’s gestures. We
can learn from patient data to predict the likelihood that a patient
will be readmitted after their discharge from a hospital.
MNC: What sets Microsoft Research’s machine learning research apart from others in the field?
Horvitz: Microsoft Research
is well known as an open research lab where we promote research freedom
to publish on our results and advances. That has attracted the best and
the brightest people. Folks at MSR are energized by a stream of
interesting real-world challenges. They also have access to large data
resources – and the tantalizing opportunity to get one’s best ideas into
the hands of hundreds of millions of people. Our researchers
investigating machine learning are very much part of the larger
community of researchers worldwide pursuing studies in machine
intelligence. Beyond machine learning, this research includes machine
perception, automated reasoning and decision making. Machine learning
runs deep in the DNA of Microsoft Research; the area of work was one of a
few early critical priority areas that we invested in.
Today, people doing machine learning research across our labs are a
substantial intellectual force. This includes teams of deep thinkers
working on core principles as well as applications. We have teams of
folks doing machine learning in Redmond, Cambridge, Beijing, Bangalore,
Silicon Valley, New England and New York City. Together, these groups
form one of the largest machine learning efforts in the world.
MNC: What are some ways that MSR machine learning research has found its way into Microsoft products?
Horvitz: Numerous efforts
have found their way into Microsoft products and services. Many of
these successes stem from very close collaborations between people at
MSR and folks on the product teams. As one example, Microsoft Research
did the core work on learning how to rank items. This work led to
Bing’s core methods for ranking search results in response to user
queries. MSR is also well known for its work in vision systems –
machines that can see and recognize what they’re seeing – as well as
speech recognition and translation. When you use Bing voice search or
Bing translator, you’re leveraging core MSR machine learning efforts.
Our Cambridge team is well known for methods that learn to
understand how to take an image and to segment and categorize it; this
valuable and innovative work was a critical enabler for the Kinect,
which can identify people and their gestures in a room.
MSR is also known for applying machine learning research in the
field of biomedical informatics and other aspects of clinical
healthcare. In the Redmond lab, we’ve had major efforts in harnessing
and utilizing the large quantities of clinical data coming out of
hospitals now to build predictive models for guiding decision-making in
hospitals. These systems are at work in hospitals as I speak,
enhancing healthcare. Another application is Bing Maps and Bing
Directions, which provide traffic-sensitive directions for 72 cities in
North America. Bing Directions uses methods from MSR that showed how
we can learn from histories of traffic data how to predict real-time
flows on all streets in a greater city region. Machine learning even
occurs deep in the Windows operating system. MSR teamed with Windows to
develop a real-time prefetching system that runs in Windows 7 and
Windows 8. Windows continues to learn from users about their patterns of
activity and then makes predictions about next actions – making the
operating system even faster.
MNC: What are some goals of this extensive machine intelligence research?
Horvitz: The directions and
goals are broad, from explorations of the basic science of machine
learning to understanding how to best solve particular classes of data
and perform specific tasks. We also explore the development of more
efficient and powerful tools to support the engineering practice of
machine learning. On this front, we’ve been exploring the development
of tools and methods that let non-experts or semi-experts do a great
job with their own predictive modeling and data analytics. This is a
very, very interesting challenge – to put the power in the hands of end
users – typically, this kind of analytical power has only been in the
hands of machine learning experts and statisticians.
MNC: That sounds like an immense challenge. Where do you start in trying to make machine intelligence available to the masses?
Horvitz: In machine
learning, numerous algorithmic procedures have been developed; each
typically comes with levers and knobs for tuning the methods to the data
and task at hand. Questions arise about which method is best to use for
a particular dataset and learning task. There are also challenges with
cleaning, preparing and anonymizing raw data so it can be easily
processed and analyzed. There are multiple danger zones in machine
learning, and new kinds of tools can help people to specify what it is
they want to learn and how to validate the accuracy of the predictions
made by the models that they build. Then there’s decision making. This
centers on how to guide actions and policies in the world based on
predictions. We’re working to create new kinds of tools that guide data
collection, analysis and testing – and that also provide end users with
insights about visualization and decision making.
MNC: What are some of the other hurdles in the world of machine learning?
Horvitz: One challenge that
we’ve been taking on is machines that can understand and even translate
conversational speech. Sometimes small gains in accuracy have big
implications for the competency of a system. Recently, (MSR Chief
Research Officer) Rick Rashid demonstrated in front of a large audience
in Tianjin, China, the ability to do real-time translation from English
to Mandarin Chinese. He was talking freely and having his speech
translated and then re-rendered in his own voice – he was speaking
Mandarin in real time. That translation pipeline was enabled by several
technologies, but in some ways the most salient and surprising
innovation was a surprising increase in the accuracy of speech
recognition for conversational speech. That’s just happened in the last
couple of years, and was the result of research and experimentation at
MSR on new directions in machine learning.
MNC: So what aspects of big data will Microsoft Research focus on?
Horvitz: There are so many
fun and promising directions. I have to say, it’s really an exciting
opportunity area – and we’re at an exciting time. Looking out at the
longer-term future, I expect that machine learning, and machine
intelligence more broadly, is going to provide us with foundational new
tools for doing scientific research, and that many breakthroughs over
the next few decades will come as a collaboration between people and the
machine learning and reasoning tools. There are opportunities to learn
new things from large amounts of data, including getting to the bottom
of healthcare mysteries by going through data with automated learning
tools – some of which can recognize causality, that A actually causes B.
Another direction is working to weave together a set of
technologies – machine learning, speech recognition, natural language
understanding, machine vision and decision making – to create systems
that act like bright collaborators and that complement human intellect
in new kinds of ways.
On another front, there’s a great deal of opportunity to do new
kinds of search and retrieval on the Web. We’re also applying machine
learning in new ways to pick out signals in large amounts of population
data. For example, in recent work, we’ve developed a way to discover
clues about medication side effects in anonymized search logs. I
believe that data-centric methods will change the world in so many ways,
with influences on health, education, science and commerce.
MNC: If you were to get a bit Jules Verne, what could all of this research mean for the future?
Horvitz: Looking out to the
future, I believe that there’s an opportunity to build systems that
really become empowering, lifelong digital companions that deeply
understand what it is you want to do, where you want to go, what you
want to learn, what you need to do to stay healthy, what you’re good and
less good at, and that continue to work on your behalf to assist and to
complement you. Work on several fronts is already providing some
foreshadowing wisps of wider possibilities.
MNC: Why did you get into this field?
Horvitz: I have long been
interested in understanding the human mind and my curiosity led me from
biology to physics to the world of information and computation. Beyond
that core pursuit, I’ve come to be excited over the years with applying
principles of learning and decision making in real-world applications
that provide value – while somehow being related to the big questions
about thinking systems. I’ve had a blast working with and alongside
fabulous colleagues on principles and applications. And at a place like
Microsoft Research, we all have this tantalizing “lever” in mind – with a
fulcrum at the horizon. Our next innovation or idea could really move
the planet, via having an influence on Microsoft’s products and
services.
MNC: All in a day’s work, huh?
Horvitz: [Laughing] Exactly. But I’m serious about this; we’re not kidding around.
MNC: The Harvard Business Review has declared the data scientist the new sexiest job.
Horvitz: That’s great. You
might say that, in some ways, computer science and other engineering
fields have suffered over the years in that people making career choices
had been looking for “noble endeavors” – in fields like healthcare and
law. I believe that the computational sciences are becoming the noble
endeavors of our time, because computing enables so many other things
from aerospace to healthcare to science to law to government.
Editor’s note – Feb. 15, 2013 – Several updates were made post publication.