Nvidia Uses Apple Vision Pro To Control Humanoid Robots
JAKARTA - Nvidia has introduced a new set of services that lets developers build projects involving humanoid robots that are controlled and monitored using the Apple Vision Pro.
Developing humanoid robots currently faces many challenges, one of which is controlling such a technically complex device. To help in this area, Nvidia has provided a number of tools for robotics simulation, including some that assist with control.
Nvidia is offering these tools to leading robot manufacturers and software developers. The suite of models and platforms is aimed at training a new generation of humanoid robots.
The collection of tools includes what Nvidia calls NIM microservices and frameworks intended for simulation and learning. There is also the Nvidia OSMO orchestration service for handling multi-stage robotics workloads, as well as teleoperation workflows supported by AI and simulation.
As part of this workflow, headsets and spatial computing devices like the Apple Vision Pro can be used not only to view data but also to control hardware.
"The next wave of AI is robotics and one of the most interesting developments is humanoid robots," Nvidia CEO and founder Jensen Huang said. "We are developing a whole pile of NVIDIA robotics, opening up access for developers and humanoid companies around the world to use platforms, acceleration libraries, and AI models that best suit their needs."
NIM microservices are pre-built containers powered by Nvidia's inference software, meant to cut deployment time. Two of these microservices are designed to help developers with simulation workflows for generative physical AI in Nvidia Isaac Sim, a reference application.
One of these microservices, MimicGen NIM, helps users control hardware using the Apple Vision Pro or other spatial computing devices. It generates synthetic motion data for robots from recorded teleoperation data, translating movements captured by the Apple Vision Pro into movements the robot then carries out.
Videos and images show that this is more than simply moving a camera based on headset motion: hand movements and gestures are also recorded and used, captured by the Apple Vision Pro's sensors.
This means a user can watch the robot's movements and directly control its hands and arms, all using the Apple Vision Pro.
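As a rough illustration of the idea (and not Nvidia's actual API), a teleoperation loop of this kind might look like the sketch below, where the headset, robot, and retargeting objects are hypothetical placeholders standing in for the tracking, inverse-kinematics, and robot control layers.

```python
import time

# Hypothetical teleoperation loop: stream hand poses from a headset,
# retarget them to robot joint targets, and send commands to the robot.
# Every object and method here is an illustrative placeholder, not a real SDK.

def retarget_to_robot(hand_pose, robot_model):
    """Map the human wrist/hand pose onto the robot's arm and gripper joints,
    e.g. via inverse kinematics, clamping to what the robot can actually do."""
    arm_joints = robot_model.solve_ik(hand_pose.wrist)       # placeholder IK call
    gripper = min(max(hand_pose.grip_closure, 0.0), 1.0)     # normalize grip to 0..1
    return arm_joints, gripper

def teleoperate(headset, robot, robot_model, hz=30):
    """Run the control loop at a fixed rate and keep a log of everything sent,
    which later serves as 'recorded teleoperation data'."""
    period = 1.0 / hz
    log = []
    while robot.is_running():
        hand_pose = headset.get_latest_hand_tracking()        # placeholder tracking call
        arm_joints, gripper = retarget_to_robot(hand_pose, robot_model)
        robot.send_joint_command(arm_joints, gripper)          # placeholder robot command
        log.append((time.time(), hand_pose, arm_joints, gripper))
        time.sleep(period)
    return log
```

The log returned at the end is the kind of recorded teleoperation data that a tool like MimicGen can later expand into much larger synthetic datasets.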
Rather than having the humanoid robot simply mimic every motion exactly, a system like Nvidia's can interpret what the user is trying to do. Since the user gets no tactile feedback from whatever the robot is holding, directly copying hand movements could be dangerous.
Other teleoperation workflows demonstrated at Siggraph allow developers to generate large amounts of motion and perception data from only a small number of demonstrations captured remotely by humans.
In the demonstration, the Apple Vision Pro captures a person's hand movements. Those recordings are then used in simulation with the MimicGen NIM microservice and Nvidia Isaac Sim, which generate synthetic datasets.
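Conceptually, this kind of data expansion takes a handful of recorded demonstrations, replays perturbed variants of them in simulation, and keeps only the rollouts that still succeed. The sketch below is a heavily simplified, hypothetical version of that idea; the callables it takes are illustrative stand-ins, not the MimicGen NIM interface.

```python
import random

def generate_synthetic_demos(seed_demos, perturb_scene, replay_in_sim, task_succeeded,
                             n_target=1000):
    """Expand a few recorded human demos into a large synthetic dataset by replaying
    perturbed variants in simulation and keeping only the successful rollouts.
    The three callables are placeholders for the simulator-specific pieces."""
    synthetic = []
    while len(synthetic) < n_target:
        demo = random.choice(seed_demos)        # pick one of the few human demos
        scene = perturb_scene(demo)             # e.g. randomize object poses in the sim
        rollout = replay_in_sim(demo, scene)    # adapt and replay the recorded actions
        if task_succeeded(rollout):             # discard rollouts that no longer solve the task
            synthetic.append(rollout)
    return synthetic
```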
Developers can then train the Project GR00T humanoid model on a combination of real and synthetic data. This approach helps reduce the cost and time of collecting all the data from scratch.
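As a rough sketch of what combining real and synthetic data can mean in practice, a training pipeline might simply draw each batch from both pools, weighting the scarcer real demonstrations more heavily. The names below are hypothetical and do not reflect the actual Project GR00T training interface.

```python
import random

def mixed_batches(real_demos, synthetic_demos, batch_size=64, real_fraction=0.3):
    """Yield training batches drawn from both pools: a fixed share comes from the
    scarce real demonstrations, the remainder from the large synthetic dataset."""
    n_real = int(batch_size * real_fraction)
    while True:
        batch = random.sample(real_demos, min(n_real, len(real_demos)))
        batch += random.choices(synthetic_demos, k=batch_size - len(batch))
        random.shuffle(batch)
        yield batch

# Usage sketch: feed the mixed batches into whatever imitation-learning update
# the humanoid policy uses (placeholder names `policy` and `train_step`).
# for step, batch in zip(range(num_steps), mixed_batches(real, synthetic)):
#     train_step(policy, batch)
```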
"Development of a very complex humanoid robot requires tremendous amounts of real data, which are caught with great difficulty from the real world," said Fourier robotic platform maker CEO Alex Gu. "Generative AI development tools and new Nvidia simulations will help accelerate the workflow of our model development."
Micro services, as well as access to models, OSMO managed robotic services, and other frameworks, are all offered under the Nvidia Humanoid Robot Developer Program. Access is provided by the company only to software developers, hardware, or humanoid robot manufacturers.