A Silicon Valley-based hardware team collaborated with Microsoft
Research to overcome technological hurdles with the new time-of-flight
sensing camera in Xbox One.
Cyrus Bamji had encountered a challenge. Luckily for him, Microsoft Research had just the solution.
Bamji, Microsoft partner hardware architect for Microsoft’s Silicon Valley-based Architecture and Silicon Management group, and members of his team were trying to incorporate a time-of-flight camera into Xbox One, the successor to the wildly popular Xbox 360.
A time-of-flight camera emits light signals and then measures how long it takes them to return. That needs to be accurate to 1/10,000,000,000 of a second—remember, we’re talking the speed of light here. With such measurements, the camera is able to differentiate light reflecting from objects in a room and the surrounding environment. That provides an accurate depth estimation that enables the shape of those objects to be computed.
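To see why that timing figure matters, the arithmetic can be sketched in a few lines of Python. This is an idealized direct-timing model, not a description of how the Kinect sensor actually extracts timing, which the article does not detail. The function names and the 3-meter living-room distance are illustrative assumptions.

```python
# Speed of light in vacuum, in metres per second (exact SI value).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def depth_from_round_trip(round_trip_s: float) -> float:
    """Distance to a surface, given the light's out-and-back travel time.

    The light covers the camera-to-object distance twice, hence the
    division by two.
    """
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


def round_trip_from_depth(depth_m: float) -> float:
    """Out-and-back travel time for a surface at the given distance."""
    return 2.0 * depth_m / SPEED_OF_LIGHT_M_PER_S


# The article's timing accuracy: 1/10,000,000,000 of a second.
timing_accuracy_s = 1 / 10_000_000_000

# Depth uncertainty implied by that timing accuracy, in centimetres.
depth_accuracy_cm = depth_from_round_trip(timing_accuracy_s) * 100

# Hypothetical example: a player standing 3 metres from the sensor.
round_trip_ns = round_trip_from_depth(3.0) * 1e9

print(f"Depth accuracy: {depth_accuracy_cm:.2f} cm")
print(f"Round trip for 3 m: {round_trip_ns:.1f} ns")
```

Under this model, a timing error of one ten-billionth of a second corresponds to about 1.5 centimeters of depth, and light needs only about 20 nanoseconds to reach a player 3 meters away and come back.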
That speed-of-light capability would be a major advancement for the Kinect sensor portion of Xbox One, being released to 13 launch markets next month. The new Kinect, a key differentiator for Xbox One against its competition, needed to capture a larger field of view with greater accuracy and higher resolution. An infrared sensor will enable object identification requiring little to no light, and improved hand-pose recognition, giving gamers and more casual users the ability to control the console with their hands.
But Cyrus Bamji had a challenge. The sensor was great, but it also left those working on it eager to do even more with it.
“When we take a relatively new technology, such as time-of-flight, and put it into a commercial product, there are a whole bunch of things that happen,” he says. “There are things that we didn’t know how important they were until the product was made. For example, we know theoretically that motion blur in time of flight is a big problem, but just how important is only discoverable when you’re building a product with it and that product needs to deliver an excellent experience.”
The new camera’s high resolution and wider field of view also posed user-experience issues. Measuring depth accurately across diverse scenes made it difficult to keep small objects, such as a finger, from fading into the background. Those features delivered more versatile device performance, but they created issues of their own in real-life scenarios. Addressing them, along with the motion blur, required clean data, and quickly. Xbox One had to be ready for the 2013 holiday season.
“We knew our time was limited,” Bamji recalls. “But we also had the advantage of being able to tap into Microsoft Research’s deep reservoir of technical expertise to get expert advice and help solve the various problems we encountered with new, cutting-edge solutions.”
Eyal Krupka, principal applied researcher, Microsoft Research Advanced Technology Lab, was up to the challenge.
Eyal Krupka
“I was in Redmond last summer, working on hand-pose-recognition research, also for Xbox One,” Krupka says, “and Mark Plagge, a principal program-manager lead on the Xbox One team, approached me about the ongoing work in solving some issues with the camera. They had made huge progress, but the progress had not come quickly enough, and there were not any clear solutions yet. He asked me to check to see if I could help.”
Travis Perry, a senior system architect lead with the Architecture and Silicon Management team, says that things took off from there.
“Eyal and I had many meetings discussing the various tradeoffs of the sensor and discussing the problem statements,” says Perry, who worked with Krupka on algorithm and parameter optimization. “Our team supported Eyal and his team with data and software for the existing depth calculations, and Eyal and I worked together to achieve better edge and motion-blur performance.”
That sort of engagement and teamwork was key to the project’s success.
“It wasn’t like consultant mode, where we asked something and the researchers gave us an opinion,” Bamji says. “They really took charge of the project. They did all the tests. They built a whole infrastructure of software to deliver us a complete solution. Essentially, they took charge. We’re really grateful for that.”
Krupka, joined by a few Microsoft Research colleagues who made contributions of their own, worked well with the Xbox partners, and the two groups’ domain knowledge proved complementary.
“The reason Eyal and I were successful,” Perry says, “was because of his extensive knowledge of computer vision, signal processing, and machine learning, along with my knowledge of time-of-flight technology and the system tradeoffs, allowing us to make the right decisions in a short amount of time and keep to the tight schedule.”
Krupka also worked diligently to gain a deep understanding of how the system worked.
“Eyal was curious from the beginning to understand how the technology works, the underlying mechanism, and the various noise models that go into the system,” says Sunil Acharya, senior director of engineering for the Architecture and Silicon Management team. “His team was helping the software team with face- and hand-recognition algorithms when he found out about the time-of-flight challenges we were facing. He jumped in and worked with us very closely, and his team started working on solutions that mapped directly into the product timeline.”
For Bamji, it was an a-ha moment.
“We had researchers who understood that time was of the essence,” he says. “We could ask them about a problem, and they would get on it and essentially come up with solutions that were technically challenging, but not in a vacuum. And they delivered solutions in a timeframe that was something that could be of use to us.
“The success story is a rapid response and the solving of difficult problems.”
That is a concise summation of the value of Microsoft Research, a unique asset to Microsoft developers of devices and services. For Krupka, this is significant.
“The research aspects of what we delivered for Xbox One did not start on the day we start working with the product team,” he says. “It starts years before we learn about any specific project or problem. It is based on accumulating a wide range of research expertise through exploration on multiple research projects, accumulating engineering and research tools and practices—including rapid research methods.
“This is achieved by rotating cycles of working on long-term research problems, then switching to short-term research tasks. This is critical to the success. If we did only short-term, on-demand research, we couldn’t have the critical assets when we work on the product’s problems. If we worked only on long-term research, we would have had a harder time switching gears to deliver solutions on a product group’s timeline.”
The analog nature of the time-of-flight data posed challenges to delivering such a solution.
“The time-of-flight data coming out of our sensor is per pixel, per frame, and there is a lot more analog information,” Acharya says. “Another issue was that the foreground objects close to the background objects would melt into the background, again due to the analog nature of how our sensor provides the depth data for pixels that land on edges.
“This resulted in a lot of information, and to make it easier for foreground/background extraction and scene segmentation, use by software and game developers, the requirement was to clean up this data simultaneously by adding software algorithms in the pipe, yet without incurring a performance hit. This was crucial. We started with various work streams and, in the end, settled on making optimization to the parameters in the system to overcome the issue.”
The collaborators wanted to deliver a clear separation of foreground and background even if the objects are close to each other. That, too, proved difficult. And then there was motion blur.
Travis Perry (left) and Sunil Acharya
“Motion blur,” Acharya explains, “is a parameter that needs to be minimized and is not technology-specific. The time-of-flight camera uses global shutter, which has helped reduce motion blur significantly—from 65 milliseconds in the original Kinect to fewer than 14 milliseconds now.”
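A rough way to picture what that reduction means: if the two figures are read as the time window over which a frame's data is gathered (an interpretation, not something the article spells out), the distance a moving hand travels during that window shrinks in proportion. The hand speed of 2 meters per second below is a hypothetical value chosen for illustration.

```python
def smear_cm(speed_m_per_s: float, window_ms: float) -> float:
    """Distance, in centimetres, an object moves during a capture window."""
    return speed_m_per_s * (window_ms / 1000.0) * 100.0


# Hypothetical hand speed; not a figure from the article.
HAND_SPEED_M_PER_S = 2.0

# Figures quoted in the article.
ORIGINAL_KINECT_MS = 65.0
NEW_KINECT_MS = 14.0  # quoted as an upper bound ("fewer than 14")

old_smear = smear_cm(HAND_SPEED_M_PER_S, ORIGINAL_KINECT_MS)
new_smear = smear_cm(HAND_SPEED_M_PER_S, NEW_KINECT_MS)

print(f"Original Kinect: hand moves about {old_smear:.1f} cm")
print(f"New Kinect: hand moves at most about {new_smear:.1f} cm")
print(f"Reduction factor: at least {old_smear / new_smear:.1f}x")
```

With these assumptions the hand moves about 13 centimeters during the older window and under 3 centimeters during the newer one, which is roughly the difference between a finger smearing across the width of a palm and staying close to its true position.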
Other challenges presented themselves. For one thing, processing time became an issue. In the academic literature about time-of-flight systems, processing time wasn’t an issue. In the laboratory environment, the technology worked fine. But Xbox One needs to process a whopping 6.5 million pixels per second. And only a small part of Xbox One’s computing power could be harnessed for this task. The lion’s share is reserved, understandably, for essentials such as gaming, skeleton tracking, face recognition, and audio.
“You need to do very, very light computation for each pixel,” Krupka says, “and this is one of the things that made the problem challenging and different from the typical approach in the academic literature in this field.”
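To put "very, very light computation" in perspective, here is a back-of-the-envelope budget in Python. The article gives only the 6.5-million-pixels-per-second figure. The 512 x 424 resolution at 30 frames per second is an assumed combination that happens to match it, and the clock rate and the 10 percent share of one core are purely illustrative placeholders, not published numbers for this pipeline.

```python
def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Depth pixels the pipeline must handle every second."""
    return width * height * fps


def cycles_per_pixel(clock_hz: float, share: float, pixel_rate: float) -> float:
    """Processor cycles available per pixel for a given share of one core."""
    return clock_hz * share / pixel_rate


def nanoseconds_per_pixel(pixel_rate: float) -> float:
    """Wall-clock time per pixel if pixels are processed one at a time."""
    return 1e9 / pixel_rate


# Assumed sensor geometry and frame rate (not stated in the article).
pixel_rate = pixels_per_second(width=512, height=424, fps=30)

# Illustrative placeholders: a 1.75 GHz core with 10% set aside for depth.
budget = cycles_per_pixel(clock_hz=1.75e9, share=0.10, pixel_rate=pixel_rate)

print(f"Pixel rate: {pixel_rate / 1e6:.2f} million pixels per second")
print(f"Time per pixel: {nanoseconds_per_pixel(pixel_rate):.0f} ns")
print(f"Budget: about {budget:.0f} cycles per pixel")
```

Under those assumptions each pixel gets a few dozen processor cycles, which leaves room for a handful of arithmetic operations and table lookups but not for the iterative or neighborhood-heavy methods common in laboratory settings.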
And then there was that tight timeframe for delivery. Benchmark numbers in multiple domains had to be attained to make the Xbox One perform at the highest level.
“The fact that we needed to hit all the benchmark performance numbers in all these multiple, different domains was a challenge,” Bamji confirms. “Many of these things were heavy-duty theoretical stuff. That’s why we reached out to Microsoft Research and asked for their help.”
Remarkably, it all came together, and...