Semester 2 PBL Project

Gesture-Controlled
Robotic Arm

Real-time hand gesture recognition and mapping using MediaPipe and OpenCV for intuitive robotic control

Technology Stack
Python • OpenCV • MediaPipe
Hardware
ESP32 • Servo Motor • OLED
01

Problem Statement

Identifying the Challenge

Traditional robotic control systems rely on physical interfaces such as joysticks, buttons, or keyboards, which limit natural interaction and accessibility. These methods require specialized training and create barriers for users who need intuitive, contactless control mechanisms.

Limited Accessibility

Physical controllers restrict accessibility for users with mobility challenges

Steep Learning Curve

Traditional interfaces require extensive training and practice

Unnatural Interaction

Physical controls lack the intuitiveness of natural hand movements

02

Project Objectives

Goals and Targets

Real-Time Hand Tracking

Implement MediaPipe framework to detect and track 21 hand landmarks with low latency using webcam input

Gesture-to-Motion Mapping

Map index fingertip X-axis position to servo angle range (0–180°) using proportional scaling

Hardware Integration

Establish serial communication with ESP32 microcontroller for reliable servo motor control

Visual Feedback System

Display real-time servo angle on OLED screen for immediate user feedback

03

Methodology

Technical Approach

Computer Vision Pipeline

1
Video Capture
OpenCV captures real-time webcam feed at 30+ FPS
2
Hand Detection
MediaPipe identifies hand presence in frame
3
Landmark Extraction
Extract 21 3D landmarks including index fingertip
4
Coordinate Conversion
Convert MediaPipe's normalized (0–1) coordinates to pixel positions using frame dimensions

Control System

1
Proportional Mapping
Map X-coordinate (0-1) to servo angle (0-180°)
2
Serial Communication
Transmit angle data to ESP32 via UART
3
PWM Signal Generation
ESP32 converts angle to PWM for servo control
4
OLED Feedback
Display current angle on 128x64 OLED display
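The proportional mapping in step 1 can be expressed as a pure function. The clamp and the exponential-smoothing helper are assumptions added here to keep landmark jitter off the servo; they are not details stated in the slides.

```python
def x_to_angle(x_norm):
    """Linearly map a normalized X coordinate (0-1) to a servo angle (0-180)."""
    x_norm = max(0.0, min(1.0, x_norm))  # clamp: the hand can leave the frame
    return int(x_norm * 180)

class Smoother:
    """Exponential moving average to damp frame-to-frame jitter
    (an assumed refinement, not described in the slides)."""
    def __init__(self, alpha=0.3):
        self.alpha, self.value = alpha, None

    def update(self, angle):
        if self.value is None:
            self.value = angle
        else:
            self.value = self.alpha * angle + (1 - self.alpha) * self.value
        return int(self.value)

print(x_to_angle(0.5))  # -> 90
print(x_to_angle(1.2))  # -> 180 (clamped)
```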

Core Technologies

MediaPipe
Hand Landmark Detection
OpenCV
Video Processing
Python
Core Programming Language
04

System Workflow

End-to-End Pipeline
Webcam
Video Input
OpenCV
Frame Processing
MediaPipe
Hand Detection
Mapping
X → Angle
Serial
Data Transfer
ESP32
Microcontroller
Servo Motor
Physical Actuation
OLED
Visual Feedback

Pipeline Details

The system operates in a continuous loop at 30+ FPS. Each frame is processed through MediaPipe to extract hand landmarks, specifically the index fingertip (landmark 8). The normalized X-coordinate is linearly mapped to a servo angle range of 0-180°. This angle is transmitted via serial UART to the ESP32, which generates the corresponding PWM signal for servo control while simultaneously updating the OLED display with real-time feedback.
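The serial hand-off in the loop above can be sketched as follows. The wire format (a newline-terminated ASCII integer) is an assumption chosen for illustration; a newline delimiter would let the ESP32 firmware read whole commands at a time. The pyserial port name in the comment is likewise illustrative.

```python
def frame_angle(angle):
    """Encode a servo angle as a newline-terminated ASCII integer.

    The exact wire format is an assumption; a newline delimiter lets the
    receiving firmware read one complete command per line.
    """
    if not 0 <= angle <= 180:
        raise ValueError(f"angle out of range: {angle}")
    return f"{angle}\n".encode("ascii")

# On the PC side these bytes would go out via pyserial, e.g.:
#   import serial
#   port = serial.Serial("/dev/ttyUSB0", 115200)  # port name is illustrative
#   port.write(frame_angle(90))
print(frame_angle(90))  # -> b'90\n'
```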

05

Current Progress

Semester 2 Achievements
✓ Phase 1 Complete

Vision System Operational

  • MediaPipe hand tracking successfully integrated
  • Real-time landmark detection achieving <50ms latency
  • Index fingertip tracking with 95%+ accuracy

Hardware Integration Complete

  • Serial communication established with ESP32
  • Servo motor responding to gesture input
  • OLED displaying live angle feedback (0-180°)

Performance Metrics

Processing Speed 30+ FPS
Tracking Accuracy 95%
Response Latency <50ms

Technical Implementation

  • Python-based control system with modular architecture
  • Proportional mapping algorithm implemented
  • ESP32 firmware for PWM generation and OLED control
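The angle-to-PWM conversion performed by the ESP32 firmware reduces to a small calculation, sketched here in Python. The 50 Hz period and 0.5–2.5 ms pulse endpoints are a common hobby-servo convention assumed for illustration (exact endpoints vary by servo model); on an ESP32 running MicroPython, the result would feed `machine.PWM.duty_u16()`.

```python
def angle_to_duty_u16(angle, period_ms=20.0, min_ms=0.5, max_ms=2.5):
    """Convert a servo angle (0-180) to a 16-bit PWM duty value.

    Assumes a 50 Hz signal (20 ms period) with 0.5-2.5 ms pulses, a
    common hobby-servo convention; actual endpoints vary by servo model.
    """
    pulse_ms = min_ms + (angle / 180.0) * (max_ms - min_ms)
    return int(pulse_ms / period_ms * 65535)

print(angle_to_duty_u16(0))    # -> 1638
print(angle_to_duty_u16(180))  # -> 8191
```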
06

Results & Validation

Performance Analysis

System Performance

Key Achievements

Real-Time Operation
Consistent 30+ FPS processing with <50ms latency
Precise Control
Smooth servo movement across full 180° range
Visual Feedback
Live OLED display provides immediate user confirmation

Testing Outcomes

100%
Serial Communication Success
95%
Gesture Recognition Rate
0-180°
Full Servo Range Achieved
<50ms
End-to-End Latency
07

Future Work

Roadmap Ahead
SEMESTER 3

ROS2 Integration

  • Implement ROS2 nodes for gesture and control
  • Develop inverse kinematics solver
  • Create custom message types for hand data
  • Build trajectory planning module
SEMESTER 3

Gazebo Simulation

  • Create URDF model of robotic arm
  • Set up Gazebo physics simulation environment
  • Test gesture control in virtual environment
  • Validate before hardware deployment
SEMESTER 4

Robotic Arm Hardware

  • Design and fabricate multi-DOF robotic arm
  • Integrate multiple servo motors for joints
  • Implement end-effector gripper mechanism
  • Expand gesture mapping to 3D control (X, Y, Z)

Additional Enhancements Planned

Implement multiple gesture recognition for complex commands
Add collision detection and safety constraints
Develop web-based monitoring dashboard
Explore integration with AI-based path optimization

Current Development Status

Semester 2 focuses on establishing the foundation with computer vision and single-axis servo control. ROS2 and Gazebo integration are planned for Semester 3 to build the software framework and test virtually. Physical robotic arm construction will follow in Semester 4 after validating the control algorithms in simulation. The current system provides a robust proof-of-concept for gesture-based control that will be expanded into a full multi-DOF robotic arm system.

Project Team

B.Tech Computer Science & Engineering
Project Guide

Dr. Rishi Gupta

Faculty Supervisor
Project Builder

Vishwas Patel

Reg. No: 2427030557
Academic Year
2025 - 2026