Guide to Computer Vision Pick-and-Place: Coupling OpenCV on a PC with an ESP32 Servo-Driven Arm via Serial

Imagine a robotic arm that can see, think, and act — detecting a red ball on a conveyor belt, calculating its position, and lifting it precisely. This is the power of Computer Vision Pick-and-Place: where machine vision meets physical automation.

In this guide, you’ll build a real-world system: an OpenCV-powered desktop application runs on your PC to detect and locate an object, then sends its coordinates over a serial port to an ESP32 microcontroller. The ESP32 drives the arm’s base and lift servos toward the object, and an optional gripper servo completes the pick-and-place cycle. No exotic hardware, just accessible tools and smart integration.

How It All Fits Together

PC runs detection → serial sends data → ESP32 acts

  • Object Detection: OpenCV processes camera input to locate objects by color, shape, or template.
  • Coordinate Extraction: The center (x, y) and size of the target object are calculated.
  • Serial Communication: Data is encoded and sent over USB serial to the ESP32 at 115200 baud.
  • Action Trigger: The ESP32 interprets packets and moves servos to the correct angle.
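
The data flow above can be sketched as a pair of helpers for the line-based message used later in this guide ("X145,Y210" followed by a newline). The names encode_target and parse_target are illustrative assumptions, not part of any library. The parser only mirrors in Python what the ESP32 is expected to do with each line.

```python
def encode_target(x, y):
    """Build one wire message, e.g. b"X145,Y210\n" (hypothetical helper)."""
    return f"X{int(x)},Y{int(y)}\n".encode("ascii")


def parse_target(line):
    """Parse b"X145,Y210\n" back into (145, 210); return None if malformed."""
    text = line.decode("ascii", errors="replace").strip()
    sep = text.find(",Y")
    # Need a leading "X", at least one digit before ",Y", and digits after it.
    if not text.startswith("X") or sep < 2:
        return None
    x_str, y_str = text[1:sep], text[sep + 2:]
    if not (x_str.isdigit() and y_str.isdigit()):
        return None
    return int(x_str), int(y_str)


message = encode_target(145.7, 210.2)
print(message)                # the bytes that would be passed to ser.write()
print(parse_target(message))  # round trip back to integer pixel coordinates
```

Keeping the format as plain ASCII lines makes it easy to test from any serial monitor by typing a line by hand.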

Hardware You’ll Need

Core Components

  • ESP32 Dev Module (e.g., ESP32-DevKitC)
  • 2–3 Standard Servos (e.g., SG90)
  • Micro USB Cable (for PC ↔ ESP32)
  • 5V Power Supply (recommended: external power for the servos, see Step 1)

Peripherals

  • Webcam (1080p recommended) or Raspberry Pi Camera
  • Object for testing (e.g., red ball, printed QR code, or colored tape)
  • Breadboard & Jumper Wires

Step 1: Build the Mechanical Arm

You don’t need a complex 6-axis arm — a basic 2-joint arm (base + lift) suffices for this tutorial. Here’s how to assemble a minimal setup:

  1. Base Servo (horizontal): Attaches to a static base, rotates the upper arm.
  2. Lift Servo (vertical): Controls the elbow, moving the gripper up/down.
  3. Gripper: A third servo drives a claw or suction cup (optional; can be omitted for basic demo).

Ensure servos receive adequate 5V power — especially when under load. Use a separate 5V supply or USB power bank, and always connect ground between ESP32 and servo rails.

Step 2: Program the ESP32

The ESP32 waits for serial data, parses object position, and translates it into servo angles.

// serial_gripper.ino
#include <ESP32Servo.h>

// Servo objects: base (horizontal), lift (vertical)
Servo baseServo, liftServo;

// Pins
const int basePin = 13;
const int liftPin = 15;

// Angles (calibrated for your setup)
int baseAngle = 90;
int liftAngle = 120;

// Serial protocol: one line per target, e.g. "X145,Y210\n"
int x = -1, y = -1;

void setup() {
  Serial.begin(115200);
  baseServo.attach(basePin);
  liftServo.attach(liftPin);
  
  // Initial position: rest
  baseServo.write(baseAngle);
  liftServo.write(liftAngle);
  Serial.println("Ready: waiting for object coordinates (X,Y)");
}

void loop() {
  if (Serial.available() > 0) {
    String input = Serial.readStringUntil('\n');
    input.trim();
    
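    // Expect format: "X145,Y210"
    int commaPos = input.indexOf(",Y");
    if (input.startsWith("X") && commaPos > 1) {
      // Skip the leading 'X' and the ",Y" separator: toInt() returns 0
      // when the text starts with a non-digit, so the letters must be cut off.
      String xStr = input.substring(1, commaPos);
      String yStr = input.substring(commaPos + 2);

      x = xStr.toInt();
      y = yStr.toInt();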
      
      // Safety clamp: assume camera region 0–640 (X), 0–480 (Y)
      x = constrain(x, 0, 640);
      y = constrain(y, 0, 480);
      
      // Map to servo angles (tune for your geometry)
      baseAngle = map(x, 0, 640, 10, 170);
      liftAngle = map(y, 0, 480, 160, 40); // inverted Y for lift-up behavior
      
      baseServo.write(baseAngle);
      liftServo.write(liftAngle);
      
      // Feedback
      Serial.print("Moved: X");
      Serial.print(x);
      Serial.print(", Y");
      Serial.print(y);
      Serial.println(" - OK");
    }
  }
}
Pro Tip: Calibrate servo angles by physically moving the arm and testing positions. Adjust the map() ranges based on your arm’s physical limits.
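
Before flashing new map() ranges, it can help to preview on the PC which angles a given pixel position would produce. The sketch below re-implements constrain() and the documented integer formula of Arduino's map() in Python, using the same ranges as the listing above. The function names are assumptions for illustration, and some ESP32 core releases may round map() slightly differently, so treat the results as accurate to within about one degree.

```python
def constrain(value, low, high):
    """Clamp value to [low, high], like Arduino's constrain()."""
    return max(low, min(high, value))


def arduino_map(value, in_min, in_max, out_min, out_max):
    """Integer linear remap using the documented Arduino map() formula.

    C integer division truncates toward zero, while Python's // floors,
    so the sign is handled explicitly.
    """
    numerator = (value - in_min) * (out_max - out_min)
    denominator = in_max - in_min
    quotient = abs(numerator) // abs(denominator)
    if (numerator < 0) != (denominator < 0):
        quotient = -quotient
    return quotient + out_min


def pixel_to_angles(x, y):
    """Apply the same clamp and remap as the ESP32 listing to one pixel position."""
    x = constrain(x, 0, 640)
    y = constrain(y, 0, 480)
    base = arduino_map(x, 0, 640, 10, 170)
    lift = arduino_map(y, 0, 480, 160, 40)  # inverted Y, as in the listing
    return base, lift


for point in [(0, 0), (320, 240), (640, 480), (145, 210)]:
    print(point, "->", pixel_to_angles(*point))
```

Changing the four output limits here and re-running is a quick way to check that no pixel position maps outside the arm's safe range.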

Step 3: Build the Vision System (PC)

On your PC, run this lightweight OpenCV pipeline. It assumes a webcam is connected and the ESP32 appears as a serial port (e.g., /dev/ttyUSB0 on Linux, /dev/tty.usbserial-XXXX on macOS, or COM3 on Windows).

import cv2
import serial
import time

# --- CONFIG ---
SERIAL_PORT = '/dev/ttyUSB0'   # Windows: 'COM3', macOS: '/dev/tty.usbserial-XXXX'
BAUD_RATE = 115200
WEBCAM_INDEX = 0

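# Color threshold (HSV) for object; tune for your target!
# Example: red ball (lower, upper HSV ranges). OpenCV hue runs 0-179 and red
# wraps around both ends, so this band only covers reds near hue 0-10.
HSV_LOWER = (0, 70, 50)    # Adjust based on lighting & object color
HSV_UPPER = (10, 255, 255)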

# Initialize serial
try:
    ser = serial.Serial(SERIAL_PORT, BAUD_RATE, timeout=1)
    time.sleep(2)  # Wait for ESP32 reset + boot
    print("Serial connected.")
except Exception as e:
    print(f"Serial error: {e}")
    exit()

# Open camera
cap = cv2.VideoCapture(WEBCAM_INDEX)
if not cap.isOpened():
    print("Error: Camera not found.")
    ser.close()  # release the serial port before quitting
    exit()

print("Starting object detection...")

while True:
    ret, frame = cap.read()
    if not ret:
        break
    
    # Resize for faster processing (optional: adjust)
    frame = cv2.resize(frame, (640, 480))
    
    # Convert to HSV for color detection
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    
    # Threshold to detect object
    mask = cv2.inRange(hsv, HSV_LOWER, HSV_UPPER)
    
    # Clean up noise
    mask = cv2.erode(mask, None, iterations=2)
    mask = cv2.dilate(mask, None, iterations=2)
    
    # Find contours
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    
    target = None
    
    if contours:
        # Get largest contour (assume it's the target)
        c = max(contours, key=cv2.contourArea)
        area = cv2.contourArea(c)
        
        # Minimum area filter (ignore noise)
        if area > 100:
            # Get minimum enclosing circle: center and radius
            (x, y), radius = cv2.minEnclosingCircle(c)
            center = (int(x), int(y))
            radius = int(radius)

            # Draw circle and center point
            cv2.circle(frame, center, 3, (0, 255, 0), -1)  # Green dot
            cv2.circle(frame, center, radius, (0, 255, 0), 2)

            # Output coordinates (X, Y) as integer pixels
            target = center

            # Send data to ESP32 (format: "X145,Y210\n")
            cmd = f"X{target[0]},Y{target[1]}\n"
            ser.write(cmd.encode())
            print(f"Sent: {cmd.strip()}")
    
    # Display
    cv2.imshow('Pick-and-Place Vision', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Cleanup
cap.release()
ser.close()
cv2.destroyAllWindows()
Pro Tip: Test color thresholds with cv2.inRange() using real-time HSV sliders (not shown here for brevity). A well-tuned mask makes or breaks detection accuracy.
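
As a starting point for tuning, it can be useful to check where a known RGB color lands on OpenCV's 8-bit HSV scale (H 0-179, S and V 0-255). Red hues sit at both ends of the hue axis, so a second band near the top of the range is often needed alongside the 0-10 band used in the script. The sketch below uses only the standard library. The second band and all helper names are assumptions for illustration.

```python
import colorsys

# First band is taken from the script above; the second band is an assumed
# example for reds whose hue wraps to the top of OpenCV's 0-179 range.
RED_BANDS = [
    ((0, 70, 50), (10, 255, 255)),
    ((170, 70, 50), (179, 255, 255)),
]


def rgb_to_opencv_hsv(r, g, b):
    """Convert 8-bit RGB to OpenCV-style 8-bit HSV (H 0-179, S and V 0-255)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return int(round(h * 180)) % 180, int(round(s * 255)), int(round(v * 255))


def in_band(hsv, lower, upper):
    """True if every channel lies inside [lower, upper], as cv2.inRange tests per pixel."""
    return all(lo <= value <= hi for value, lo, hi in zip(hsv, lower, upper))


def matches_red(rgb):
    """Check one RGB color against all red bands."""
    hsv = rgb_to_opencv_hsv(*rgb)
    return any(in_band(hsv, lower, upper) for lower, upper in RED_BANDS)


samples = [("pure red", (255, 0, 0)), ("bluish red", (255, 0, 40)), ("green", (0, 255, 0))]
for name, rgb in samples:
    print(name, rgb_to_opencv_hsv(*rgb), matches_red(rgb))
```

In the OpenCV script, the equivalent would be to build one mask per band with cv2.inRange() and merge them with cv2.bitwise_or() before the erode and dilate steps.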

Testing & Calibration

  1. Connect the ESP32 and open your serial monitor. You should see: Ready: waiting for object coordinates (X,Y). Close the monitor before the next step, because only one program can hold the serial port at a time.
  2. Run the Python script. A camera window appears; hold up your object.
  3. Watch the green circle track the target. Check the terminal for lines like Sent: X145,Y210.
  4. Observe the arm. Does it move? If the travel is too limited or reversed, adjust the map() ranges in the ESP32 code or reposition the servo horns.
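
The script sends a command on every frame in which the object is detected, which can mean dozens of lines per second and may make the servos chatter. One possible refinement is to send only when the target has moved by a few pixels and a minimum interval has passed. The class name SendGate and both thresholds are assumptions to be tuned, not values from this guide.

```python
import time


class SendGate:
    """Decide whether a new target is worth sending (hypothetical helper)."""

    def __init__(self, min_move_px=5, min_interval_s=0.05, clock=time.monotonic):
        self.min_move_px = min_move_px
        self.min_interval_s = min_interval_s
        self.clock = clock
        self.last_target = None
        self.last_time = None

    def should_send(self, target):
        """Return True if the target moved enough and enough time has passed."""
        now = self.clock()
        if self.last_target is not None:
            too_soon = now - self.last_time < self.min_interval_s
            dx = abs(target[0] - self.last_target[0])
            dy = abs(target[1] - self.last_target[1])
            too_close = max(dx, dy) < self.min_move_px
            if too_soon or too_close:
                return False
        # Remember only the positions that were actually sent.
        self.last_target = target
        self.last_time = now
        return True


# Demo with a fake clock so the behaviour is repeatable.
fake_time = [0.0]
gate = SendGate(clock=lambda: fake_time[0])
decisions = []
steps = [(0.00, (100, 100)), (0.01, (120, 100)), (0.10, (101, 100)), (0.20, (120, 100))]
for t, target in steps:
    fake_time[0] = t
    decisions.append(gate.should_send(target))
print(decisions)
```

In the main loop, the ser.write() call would then sit inside an `if gate.should_send(target):` check.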

Common Pitfalls

  • Servo jitter? → Add a 0.1 µF ceramic capacitor across VCC and GND on each servo.
  • ESP32 crashes on boot? → Ensure no high-current devices share power with the board.
  • No serial data received? → Check port name (case-sensitive on Linux/macOS) and permissions (e.g., sudo usermod -aG dialout $USER).
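
When the port name is the problem, listing the available serial devices is usually quicker than guessing. The sketch below picks the first device whose name contains a typical USB-serial hint. The hint strings and function names are assumptions. list_devices() relies on pyserial's serial.tools.list_ports and is imported lazily, so the selection logic can be tried without any hardware attached.

```python
DEFAULT_HINTS = ("ttyUSB", "ttyACM", "usbserial", "SLAB_USBtoUART", "COM")


def pick_port(devices, hints=DEFAULT_HINTS):
    """Return the first device name containing one of the hints, else None."""
    for device in devices:
        if any(hint in device for hint in hints):
            return device
    return None


def list_devices():
    """List serial device names via pyserial (must be installed to call this)."""
    from serial.tools import list_ports
    return [port.device for port in list_ports.comports()]


# Example with made-up device names; on real hardware use pick_port(list_devices()).
print(pick_port(["/dev/ttyS0", "/dev/ttyUSB0"]))
print(pick_port(["/dev/ttyS0"]))
```

The result can be assigned to SERIAL_PORT in the vision script instead of a hard-coded name.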

Next Steps

  • Add a gripper servo for true “pick” action.
  • Log positions to learn motion paths and reduce drift.
  • Deploy YOLOv5 instead of HSV for multi-class recognition.

Bringing It All Together

You now have a functional pick-and-place system: vision on the PC, motion on the microcontroller, and a simple line-based serial protocol in between. This architecture is extendable: swap the arm for a linear actuator, add a conveyor belt, or integrate a PLC for factory automation.

The real magic lies in the synergy — OpenCV’s reliability for perception, ESP32’s responsiveness for execution, and serial communication’s simplicity for connection. Start small, iterate fast, and let the robot learn from your hands.


© 2024 Robotics & Automation Lab | Built with clarity, color, and code.
