What is hand tracking in augmented reality?

Hand tracking in augmented reality (AR) is a technology that enables AR systems to recognize and track the movements and positions of a user’s hands in real time. It allows users to interact with virtual objects or manipulate digital content in the AR environment using natural hand gestures without needing physical controllers.

How hand tracking works

Let’s explore how hand tracking works.

Hand pose estimation

  • Hand-tracking systems use computer vision algorithms and machine learning models to estimate the pose (position and orientation) of a user’s hands in the AR scene.

  • These algorithms analyze the depth, color, or infrared data captured by AR cameras to identify the location and shape of the user’s hands.
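To make "pose" concrete, here is a minimal sketch of deriving a palm position and orientation from tracked hand landmarks. It assumes the 21-landmark layout used by MediaPipe Hands (index 0 is the wrist, 5 the base of the index finger, 17 the base of the pinky) and landmarks given as { x, y, z } objects. The helper name estimatePalmPose is hypothetical and not part of any library.

```javascript
// Landmark indices, assuming the MediaPipe Hands 21-point layout.
const PALM = { WRIST: 0, INDEX_MCP: 5, PINKY_MCP: 17 };

// Small vector helpers for { x, y, z } objects.
const vecSub = (a, b) => ({ x: a.x - b.x, y: a.y - b.y, z: a.z - b.z });
const vecCross = (a, b) => ({
  x: a.y * b.z - a.z * b.y,
  y: a.z * b.x - a.x * b.z,
  z: a.x * b.y - a.y * b.x,
});
const vecNorm = (v) => Math.hypot(v.x, v.y, v.z);

// Hypothetical helper: estimates palm position and orientation.
function estimatePalmPose(landmarks) {
  const wrist = landmarks[PALM.WRIST];
  const indexBase = landmarks[PALM.INDEX_MCP];
  const pinkyBase = landmarks[PALM.PINKY_MCP];

  // Position: centroid of the three palm landmarks.
  const position = {
    x: (wrist.x + indexBase.x + pinkyBase.x) / 3,
    y: (wrist.y + indexBase.y + pinkyBase.y) / 3,
    z: (wrist.z + indexBase.z + pinkyBase.z) / 3,
  };

  // Orientation: unit normal of the plane through the three landmarks.
  // Its sign depends on whether the left or right hand is tracked.
  const n = vecCross(vecSub(indexBase, wrist), vecSub(pinkyBase, wrist));
  const len = vecNorm(n) || 1; // guard against degenerate (collinear) input
  const normal = { x: n.x / len, y: n.y / len, z: n.z / len };

  return { position, normal };
}

// Usage with synthetic landmarks (all zeros except two palm points).
const samplePalm = Array.from({ length: 21 }, () => ({ x: 0, y: 0, z: 0 }));
samplePalm[PALM.INDEX_MCP] = { x: 1, y: 0, z: 0 };
samplePalm[PALM.PINKY_MCP] = { x: 0, y: 1, z: 0 };
console.log(estimatePalmPose(samplePalm));
```

A real system would typically estimate the pose from many more landmarks, but three palm points are enough to show how a position and an orientation fall out of the landmark data.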

Gesture recognition

  • Once the hand pose is estimated, gesture recognition algorithms identify the user’s specific hand movements or gestures.

  • Common gestures include grabbing, swiping, pointing, and making signs or symbols.
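As an illustration of gesture recognition on top of an estimated pose, the sketch below detects a pinch (a common stand-in for grabbing) by comparing the gap between the thumb tip and index fingertip with the overall hand size. It again assumes the MediaPipe Hands landmark indices (4 for the thumb tip, 8 for the index fingertip, 9 for the base of the middle finger). The function name isPinching and the 0.3 ratio are illustrative assumptions, not values from any library.

```javascript
// Landmark indices, assuming the MediaPipe Hands 21-point layout.
const LM = { WRIST: 0, THUMB_TIP: 4, INDEX_TIP: 8, MIDDLE_MCP: 9 };

// Euclidean distance between two { x, y, z } landmarks.
const distance3d = (a, b) => Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);

// Hypothetical helper: returns true when the thumb and index fingertips
// are close together relative to the size of the hand.
function isPinching(landmarks, ratio = 0.3) {
  // Use the wrist-to-middle-finger-base distance as a scale reference so the
  // check does not depend on how far the hand is from the camera.
  const handSize = distance3d(landmarks[LM.WRIST], landmarks[LM.MIDDLE_MCP]);
  if (handSize === 0) return false; // degenerate input, no decision possible

  const gap = distance3d(landmarks[LM.THUMB_TIP], landmarks[LM.INDEX_TIP]);
  return gap / handSize < ratio;
}

// Usage with synthetic landmarks: fingertips nearly touching.
const sampleHand = Array.from({ length: 21 }, () => ({ x: 0, y: 0, z: 0 }));
sampleHand[LM.MIDDLE_MCP] = { x: 0, y: 1, z: 0 };
sampleHand[LM.THUMB_TIP] = { x: 0.5, y: 0.5, z: 0 };
sampleHand[LM.INDEX_TIP] = { x: 0.55, y: 0.5, z: 0 };
console.log(isPinching(sampleHand));
```

Rule-based checks like this are easy to reason about; gestures such as signs or symbols usually call for a trained classifier instead.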

Benefits of hand tracking in AR

Let’s explore the benefits of hand tracking in augmented reality (AR).

Immersive interaction

  • Hand tracking enhances immersion in AR experiences by allowing users to manipulate virtual objects directly with their hands, creating a more natural and intuitive interaction.

No additional hardware

  • Unlike traditional controllers or gloves, hand tracking requires no hardware beyond the cameras or sensors already built into the AR device, making it more accessible and user-friendly.

Accessibility

  • Hand tracking makes AR applications more accessible to a broader audience, including those with physical disabilities who may have difficulty using traditional controllers.

Coding example

Here’s a simple JavaScript code snippet that uses the Three.js library for rendering and the MediaPipe Hands model for tracking to implement hand tracking in a web-based AR application:

In this code example, we create a Three.js scene, add 21 spheres that represent the hand joints, and define an onResults function to handle the output of the MediaPipe Hands model. Each time the model reports hand landmarks, the spheres are repositioned so that they follow the user’s hand.

Note: The output initially shows a single red dot. Bring your hand in front of the camera; once your hand is tracked, the dot spreads into a group of dots, one per hand joint, and the dots follow your hand as you move it.

Make sure your browser allows camera access.

Implementing hand tracking in web-based AR: JavaScript example with Three.js and MediaPipe Hands

Explanation

  • Lines 1–6: Initialize the Three.js scene, camera, and renderer, and append the renderer’s DOM element to the body.

  • Lines 8–10: Add ambient lighting to the scene.

  • Lines 12–20: Create 21 spheres representing the hand joints, add them to the scene, and store them in the handJoints array.

  • Lines 22–23: Position the camera at a z-coordinate of 5.

  • Lines 25–38: Initialize the MediaPipe Hands model with specific configuration options.

  • Lines 40–49: Set up the video element and initialize the camera to start capturing video frames.

  • Lines 51–64: Define the onResults function to handle the results from MediaPipe Hands.

  • Lines 66–72: Define the animate function, which is the render loop that continuously updates and renders the scene.
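As a rough sketch of the onResults step described above, the helper below copies the landmarks of the first detected hand onto the joint objects. It assumes the results object exposes a multiHandLandmarks array of normalized { x, y, z } landmarks, as the MediaPipe Hands JavaScript solution does, and that each joint has a position.set(x, y, z) method like a Three.js mesh. The helper name applyHandResults, the scale factor of 5, and the exact coordinate mapping are assumptions for illustration.

```javascript
// Hypothetical helper: moves joint objects to match the first tracked hand.
// Returns true when a hand was found in this frame, false otherwise.
function applyHandResults(results, handJoints, scale = 5) {
  const hands = results.multiHandLandmarks;
  if (!hands || hands.length === 0) return false; // no hand in this frame

  hands[0].forEach((lm, i) => {
    const joint = handJoints[i];
    if (!joint) return; // ignore landmarks without a matching joint

    // Landmarks are normalized to the image (origin at the top-left corner),
    // so center them on the scene origin and flip y to point upward.
    joint.position.set(
      (lm.x - 0.5) * scale,
      (0.5 - lm.y) * scale,
      -lm.z * scale
    );
  });
  return true;
}

// Usage with stand-in joints; in the browser these would be Three.js meshes.
const makeStubJoint = () => ({
  position: { x: 0, y: 0, z: 0, set(x, y, z) { this.x = x; this.y = y; this.z = z; } },
});
const stubJoints = Array.from({ length: 21 }, makeStubJoint);
const stubResults = {
  multiHandLandmarks: [
    Array.from({ length: 21 }, () => ({ x: 0.75, y: 0.25, z: -0.1 })),
  ],
};
console.log(applyHandResults(stubResults, stubJoints), stubJoints[0].position.x);
```

Keeping this logic in a plain function that only needs position.set makes it easy to exercise without a camera or a renderer.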

Conclusion

Hand tracking in augmented reality (AR) changes how users interact with digital content by recognizing and tracking hand movements in real time, without physical controllers. Hand pose estimation and gesture recognition, built on computer vision algorithms and machine learning models, let users manipulate virtual objects intuitively, which improves both immersion and accessibility in AR experiences.

Copyright ©2025 Educative, Inc. All rights reserved