Real-time hand gesture recognition and mapping using MediaPipe and OpenCV for intuitive robotic control
Traditional robotic control systems rely on physical interfaces such as joysticks, buttons, or keyboards, which limit natural interaction and accessibility. These methods require specialized training and create barriers for users who need intuitive, contactless control mechanisms.
Physical controllers restrict accessibility for users with mobility challenges
Traditional interfaces require extensive training and practice
Physical controls lack the intuitiveness of natural hand movements
Implement MediaPipe framework to detect and track 21 hand landmarks with low latency using webcam input
Map index fingertip X-axis position to servo angle range (0–180°) using proportional scaling
Establish serial communication with ESP32 microcontroller for reliable servo motor control
Display real-time servo angle on OLED screen for immediate user feedback
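The microcontroller-side objectives (serial reception, PWM servo drive, OLED feedback) can be sketched as follows. This is a hypothetical MicroPython sketch, not the project's actual firmware, which may well be written in Arduino C++; the GPIO pins, I2C wiring, and the newline-delimited angle protocol are all assumptions.

```python
# Hypothetical ESP32-side sketch in MicroPython. Assumed details:
# servo on GPIO 18, SSD1306 OLED on I2C (SCL=22, SDA=21), and the host
# sending one "angle\n" line per update over USB serial.

def angle_to_duty_u16(angle: int) -> int:
    """Convert 0-180 degrees to a 16-bit PWM duty for a 0.5-2.5 ms pulse at 50 Hz."""
    angle = min(max(angle, 0), 180)        # clamp to the servo's valid range
    pulse_ms = 0.5 + (angle / 180) * 2.0   # 0.5 ms .. 2.5 ms pulse width
    return int(pulse_ms / 20.0 * 65535)    # 20 ms period at 50 Hz

def main() -> None:
    import sys
    import ssd1306                      # common MicroPython OLED driver (assumed available)
    from machine import I2C, PWM, Pin

    servo = PWM(Pin(18), freq=50)       # hobby servos expect a 50 Hz signal
    oled = ssd1306.SSD1306_I2C(128, 64, I2C(0, scl=Pin(22), sda=Pin(21)))
    while True:
        line = sys.stdin.readline().strip()  # angle line sent by the host PC
        if line.isdigit():
            angle = int(line)
            servo.duty_u16(angle_to_duty_u16(angle))
            oled.fill(0)
            oled.text("Angle: {} deg".format(angle), 0, 0)
            oled.show()

if __name__ == "__main__":
    main()
```

The 0.5-2.5 ms pulse range is a common hobby-servo convention; some servos use 1.0-2.0 ms, so the endpoints should be calibrated against the actual hardware.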
The system operates in a continuous loop at 30+ FPS. Each frame is processed through MediaPipe to extract hand landmarks, specifically the index fingertip (landmark 8). The normalized X-coordinate is linearly mapped to the 0–180° servo range, and the resulting angle is transmitted over UART serial to the ESP32, which generates the corresponding PWM signal for the servo while simultaneously updating the OLED display for real-time feedback.
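The per-frame loop described above can be sketched in Python. The camera index, serial port name, baud rate, and the newline-delimited angle protocol are illustrative assumptions, not details taken from the source.

```python
# Sketch of the per-frame gesture-control loop. Assumed details: camera 0,
# ESP32 on "/dev/ttyUSB0" at 115200 baud, one "angle\n" line per update.

def x_to_angle(x_norm: float) -> int:
    """Linearly map a normalized X coordinate (0.0-1.0) to 0-180 degrees."""
    x_norm = min(max(x_norm, 0.0), 1.0)  # clamp: the fingertip can leave the frame
    return round(x_norm * 180)

def main() -> None:
    import cv2                # pip install opencv-python
    import mediapipe as mp    # pip install mediapipe
    import serial             # pip install pyserial

    ser = serial.Serial("/dev/ttyUSB0", 115200, timeout=0.01)
    hands = mp.solutions.hands.Hands(max_num_hands=1,
                                     min_detection_confidence=0.7)
    cap = cv2.VideoCapture(0)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB input; OpenCV captures BGR
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            tip = results.multi_hand_landmarks[0].landmark[8]  # index fingertip
            ser.write(f"{x_to_angle(tip.x)}\n".encode())
        cv2.imshow("Gesture control", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()

if __name__ == "__main__":
    main()
```

Because the landmark X-coordinate is already normalized to 0.0-1.0 by MediaPipe, the mapping reduces to a single multiplication with clamping at the frame edges.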
Semester 2 focuses on establishing the foundation: computer vision and single-axis servo control. ROS2 and Gazebo integration is planned for Semester 3 to develop the software framework and test it virtually. Physical construction of the robotic arm will follow in Semester 4, once the control algorithms have been validated in simulation. The current system provides a robust proof of concept for gesture-based control that will be expanded into a full multi-DOF robotic arm system.