Artificial Intelligence and Virtual Reality: New Experiments at Purdue University

Combining the power of deep learning with predictive motion control.

Researchers at Purdue University are approaching virtual reality with a deep learning system they call DeepHand. Specifically, the research team is addressing the problem of accurate hand tracking in virtual reality and augmented reality, and proposing an interesting solution involving neural networks and 3D depth sensors. The thought process behind this experiment makes sense given the increasing importance of fast, accurate hand tracking in augmented reality and human-computer interfaces.

DeepHand learns from a multitude of hand poses to enable better interaction in augmented reality. (Video courtesy of Purdue University.)

In both augmented reality and virtual reality, better hand tracking means a better user experience. In real life, hand movements are something that we generally take for granted (e.g., picking up a cup, waving, tapping fingers, pointing), but these movements are very difficult to reproduce accurately in augmented reality and virtual reality systems. At Purdue, the researchers are employing DeepHand, which uses deep learning to grasp the seemingly infinite combinations of joint angles and contortions the hand can assume.

Flexion and extension are examples of angular motions in which two parts of a joint are brought closer together or moved farther apart. DeepHand tracks this by using a depth-sensing camera to capture data about the joint angles of a user’s hand in real time. Customized algorithms interpret hand motions as “feature vectors”: sets of numbers that encode key changes in the angles of the hand’s joints. These sets all go into a database, and DeepHand selects the entries that best match what the camera sees. Of course, for this database to be useful, it has to be huge. And it is. With over 2.5 million hand poses and configurations, DeepHand chooses what the team calls “spatial nearest neighbors”: the stored poses that most closely resemble an individual user’s hand movements and positions as seen by the camera.
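
To make the “spatial nearest neighbor” step concrete, here is a minimal sketch in Python of how a feature-vector lookup like this might work. The database below is a small random stand-in for DeepHand’s 2.5 million stored poses, and the names (pose_db, spatial_nearest_neighbors, k) are illustrative assumptions, not the team’s actual implementation:

```python
import numpy as np

# Hypothetical stand-in for DeepHand's pose database: each row is a
# "feature vector" summarizing the joint angles of one stored hand pose.
# The real database holds over 2.5 million entries; we use 10,000 here.
rng = np.random.default_rng(seed=0)
NUM_POSES, FEATURE_DIM = 10_000, 20
pose_db = rng.uniform(0.0, np.pi, size=(NUM_POSES, FEATURE_DIM))

def spatial_nearest_neighbors(query_features, database, k=5):
    """Return indices of the k database poses closest to the query.

    Plain Euclidean distance is assumed as the similarity measure;
    the actual system may weight joints or use a learned metric.
    """
    distances = np.linalg.norm(database - query_features, axis=1)
    # argpartition finds the k smallest distances without a full sort.
    nearest = np.argpartition(distances, k)[:k]
    return nearest[np.argsort(distances[nearest])]

# A query vector standing in for the joint angles the camera just sensed.
query = rng.uniform(0.0, np.pi, size=FEATURE_DIM)
print(spatial_nearest_neighbors(query, pose_db, k=5))
```

The partial sort keeps the lookup fast even as the database grows, which matters when the match has to happen at camera frame rates.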

At Purdue with DeepHand. Accurate hand tracking in human-computer interfaces will become increasingly important in developing an accurate digital representation of a user’s hand. (Image courtesy of Purdue University.)

DeepHand uses a deep convolutional neural network (CNN) whose activation features are combined to estimate the joint angles of the sensed hand in real time, drawing on the huge database of synthetic hand poses. Matching the camera’s view of the hand to its “spatial nearest neighbors” is just one part of the equation for improving hand tracking in virtual reality and augmented reality. But it is an interesting convergence of artificial intelligence with research into making virtual reality and augmented reality more accurate and responsive to the digital representation of your hand in real time.
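
As a rough illustration of the CNN side of this pipeline, the sketch below shows a small convolutional network in Python (using PyTorch) that regresses a pose feature vector from a single-channel depth image. The layer sizes, the 96 x 96 input resolution, and the 20-dimensional output are illustrative assumptions; DeepHand’s actual architecture is described in the research paper:

```python
import torch
import torch.nn as nn

class HandPoseCNN(nn.Module):
    """Toy CNN mapping a depth image to a pose feature vector.

    The architecture here (two conv blocks, one hidden linear layer)
    is an assumption for illustration, not DeepHand's actual network.
    """
    def __init__(self, feature_dim: int = 20):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=2, padding=2),   # 96x96 -> 48x48
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2, padding=2),  # 48x48 -> 24x24
            nn.ReLU(),
            nn.Flatten(),
        )
        self.regressor = nn.Sequential(
            nn.Linear(32 * 24 * 24, 128),
            nn.ReLU(),
            nn.Linear(128, feature_dim),  # one value per pose feature
        )

    def forward(self, depth_image: torch.Tensor) -> torch.Tensor:
        return self.regressor(self.features(depth_image))

# One fake 96x96 depth frame (batch, channels, height, width).
frame = torch.randn(1, 1, 96, 96)
model = HandPoseCNN()
pose_features = model(frame)  # shape: (1, 20)
print(pose_features.shape)
```

In a full pipeline, the vector this network produces would play the role of the query in the nearest-neighbor lookup sketched earlier.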

The research was sponsored in part by the National Science Foundation and Purdue’s School of Mechanical Engineering. If you want to take a deep dive into the research described in this article, be sure to read this fascinating research paper.