Intel’s Photorealistic Simulator for Embodied AI

SPEAR, a new AI training framework, could make household robots more useful.

SPEAR enables embodied AI developers to operate robots in complex indoor environments through simulations. (Source: Intel.)

In an effort to develop a better way to train AI models for robotics, Intel has developed the Simulator for Photorealistic Embodied AI Research, or SPEAR.

Developed in collaboration with the Computer Vision Center in Spain, Chinese design platform provider Kujiale, and the Technical University of Munich, SPEAR is an open-source platform for realistic simulations that accelerates the training and validation of embodied AI systems in indoor areas and for digital twin applications. The platform is designed to boost the innovation and commercialization of household robotics, such as robotic vacuums or humanoid robots, by simulating various human-robot interaction scenarios.

Intel’s approach seeks to broaden the scope of existing interactive simulators, which are limited in the diversity of interaction content they cover, their physical interaction capabilities, and their visual fidelity. SPEAR’s goal is to allow developers to train and validate embodied AI agents, or AI-based robots, across a wider range of tasks and environments. Although the platform’s current content pack is suited to home environments, Intel plans to release additional content for industrial and healthcare settings soon.

Building SPEAR required a team of professional artists working for more than a year to construct a series of interactive environments. The platform’s starter pack features more than 300 virtual indoor environments, with over 2,500 rooms and 17,000 objects that can be individually manipulated. The environments include detailed geometry, photorealistic materials, realistic lighting, and physically realistic interactions.

SPEAR also supports a Point-Goal Navigation Task, which computes a reward that drives an AI agent toward a goal location within an environment, and a Freeform Task, which is useful for collecting data on an agent’s performance.
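As a rough sketch of how such a reward might work, a point-goal task is commonly shaped around the agent’s change in distance to the goal at each step. The Python function below illustrates this generic scheme; the function name, success radius, and bonus value are assumptions for illustration, not SPEAR’s actual reward function.

import numpy as np

def point_goal_reward(prev_pos, curr_pos, goal_pos, success_radius=0.2):
    # Illustrative point-goal reward: reward progress toward the goal and
    # grant a bonus when the agent arrives within a success radius.
    # Generic shaping scheme, not SPEAR's exact reward function.
    prev_dist = np.linalg.norm(goal_pos - prev_pos)
    curr_dist = np.linalg.norm(goal_pos - curr_pos)
    reward = prev_dist - curr_dist        # positive when the agent moves closer
    done = curr_dist < success_radius     # episode succeeds at the goal
    if done:
        reward += 1.0                     # terminal success bonus (assumed value)
    return reward, done

# Example: one step of progress toward a goal two meters away.
reward, done = point_goal_reward(np.array([0.0, 0.0]),
                                 np.array([0.5, 0.0]),
                                 np.array([2.0, 0.0]))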

The physical world is complex, dynamic and constantly changing, which can make it challenging for robots to navigate effectively. The most salient illustration is autonomous vehicles, whose AI models often have difficulty responding to novel, unexpected scenarios that human drivers can intuitively adapt to within seconds. In June, MIT researchers released an open-source photorealistic simulator that deploys realistic environments for AV training.

Although the stakes are lower in indoor home environments, robots will need to function more independently and effectively to gain broader commercial appeal. Embodied AI agents learn by physically interacting with their surroundings, but orchestrating every likely scenario in the real world is too time-consuming, labor-intensive and risky to be a practical training method.

SPEAR supports embodied agents like the LoCoBot Agent that’s equipped with a gripper for rearrangement tasks. (Source: Intel.)

One method to achieve this goal is to provide a training environment that’s similar to real-world conditions but without the risks and labor. The simulated environment can help AI algorithms optimize perception functions, manipulation and spatial intelligence. SPEAR developers hope that the outcome will yield faster validation and reduced time to market.

SPEAR’s environments were built as Unreal Engine assets with an additional OpenAI Gym interface that enables interaction via Python. SPEAR also balances high photorealism with rendering speed to support training complex robot behaviors.
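To illustrate what exposing an OpenAI Gym interface means in practice, the following minimal Python sketch defines a stand-in environment with the standard reset/step methods and runs a short interaction loop. The class name, observation space and action space shown here are assumptions for illustration, not SPEAR’s actual API; consult the SPEAR repository for the real interface.

import gym
import numpy as np
from gym import spaces

class ToyIndoorEnv(gym.Env):
    # Stand-in environment illustrating the Gym-style interface SPEAR exposes;
    # the spaces and names here are assumptions, not SPEAR's API.
    def __init__(self):
        self.observation_space = spaces.Box(0, 255, shape=(480, 640, 3), dtype=np.uint8)  # RGB camera frame
        self.action_space = spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)           # e.g. wheel velocities

    def reset(self):
        return self.observation_space.sample()

    def step(self, action):
        obs = self.observation_space.sample()
        reward, done, info = 0.0, False, {}
        return obs, reward, done, info

env = ToyIndoorEnv()
obs = env.reset()
for _ in range(10):
    obs, reward, done, info = env.step(env.action_space.sample())  # random policy for illustration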

SPEAR currently supports four embodied agents that return camera-sensor observations, which are useful for validation and for predicting how a robot will operate in the real world. SPEAR supports the OpenBot Agent, suited for sim-to-real experiments; the Fetch Agent and the LoCoBot Agent, which can use physically realistic grippers; and the Camera Agent, which can be teleported to create images from any angle.

The agents can also provide a sequence of waypoints marking the shortest path to a goal location, as well as GPS and compass observations for optimizing navigation algorithms. Additionally, the agents provide pixel-accurate semantic segmentation and depth images, which can be used to identify and correct perception errors.
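A hypothetical example of how a navigation policy might consume such observations is sketched below; the dictionary keys, array shapes and helper function are illustrative assumptions, not SPEAR’s actual observation schema.

import numpy as np

# Hypothetical observation dictionary of the kind described above; the key
# names and shapes are assumptions, not SPEAR's actual schema.
obs = {
    "rgb":       np.zeros((480, 640, 3), dtype=np.uint8),         # photorealistic camera image
    "depth":     np.zeros((480, 640), dtype=np.float32),          # per-pixel depth
    "seg":       np.zeros((480, 640), dtype=np.int32),            # semantic segmentation labels
    "gps":       np.array([1.2, 0.0, -3.4]),                      # agent position
    "compass":   np.array([0.7]),                                 # heading in radians
    "waypoints": np.array([[1.0, 0.0, -3.0], [0.0, 0.0, -1.0]]),  # shortest path to the goal
}

def turn_toward_next_waypoint(obs):
    # Compare the compass heading with the bearing to the next waypoint,
    # a typical way a navigation policy uses GPS and compass readings.
    delta = obs["waypoints"][0] - obs["gps"]
    bearing = np.arctan2(delta[2], delta[0])   # bearing in the ground plane
    return bearing - obs["compass"][0]         # signed turn angle toward the waypoint

print(turn_toward_next_waypoint(obs))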

SPEAR is currently available on GitHub under an open-source MIT license.