Guide to Computer Vision Pick-and-Place: Coupling OpenCV on a PC with an ESP32 Servo-Driven Arm via Serial
Imagine a robotic arm that can see, think, and act — detecting a red ball on a conveyor belt, calculating its position, and lifting it precisely. This is the power of Computer Vision Pick-and-Place: where machine vision meets physical automation.
In this guide, you’ll build a real-world system: an OpenCV-powered desktop application runs on your PC to detect and locate an object, then sends its coordinates via serial port to an ESP32 microcontroller. The ESP32 drives a servo motor to move a gripper and complete the pick-and-place cycle. No exotic hardware — just accessible tools and smart integration.
How It All Fits Together
PC runs detection → serial sends data → ESP32 acts
- Object Detection: OpenCV processes camera input to locate objects by color, shape, or template.
- Coordinate Extraction: The center (x, y) and size of the target object are calculated.
- Serial Communication: Data is encoded and sent over USB serial to the ESP32 at 115200 baud.
- Action Trigger: The ESP32 interprets packets and moves servos to the correct angle.
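To make the serial step concrete, here is a minimal sketch of the one-line text packet used later in this guide ("X145,Y210" followed by a newline). The helper names format_command and parse_command are hypothetical, and parse_command only mirrors on the PC what the firmware is expected to do, which makes the protocol easy to check without hardware.

```python
def format_command(x, y, width=640, height=480):
    """Build one packet, clamping the point to the assumed 640x480 frame."""
    cx = min(max(int(x), 0), width - 1)
    cy = min(max(int(y), 0), height - 1)
    return f"X{cx},Y{cy}\n"


def parse_command(line):
    """Return (x, y) from a packet, or None if the line is malformed."""
    text = line.strip()
    if not text.startswith("X") or ",Y" not in text:
        return None
    x_part, y_part = text[1:].split(",Y", 1)
    try:
        return int(x_part), int(y_part)
    except ValueError:
        return None


packet = format_command(145, 210)
print(repr(packet))
print(parse_command(packet))
```

Keeping the packet as newline-terminated ASCII means it can be typed by hand into a serial monitor while debugging.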
Hardware You’ll Need
Core Components
- ESP32 Dev Module (e.g., ESP32-DevKitC)
- 2–3 Standard Servos (e.g., SG90)
- Micro USB Cable (for PC ↔ ESP32)
- 5V Power Supply (optional, for external power to servos)
Peripherals
- Webcam (1080p recommended) or Raspberry Pi Camera
- Object for testing (e.g., red ball, printed QR code, or colored tape)
- Breadboard & Jumper Wires
Step 1: Build the Mechanical Arm
You don’t need a complex 6-axis arm — a basic 2-joint arm (base + lift) suffices for this tutorial. Here’s how to assemble a minimal setup:
- Base Servo (horizontal): Attaches to a static base, rotates the upper arm.
- Lift Servo (vertical): Controls the elbow, moving the gripper up/down.
- Gripper: A third servo drives a claw or suction cup (optional; can be omitted for basic demo).
Ensure servos receive adequate 5V power — especially when under load. Use a separate 5V supply or USB power bank, and always connect ground between ESP32 and servo rails.
Step 2: Program the ESP32
The ESP32 waits for serial data, parses object position, and translates it into servo angles.
// serial_gripper.ino
#include <ESP32Servo.h>

// Servo objects: base (horizontal), lift (vertical)
Servo baseServo, liftServo;

// Pins
const int basePin = 13;
const int liftPin = 15;

// Angles (calibrated for your setup)
int baseAngle = 90;
int liftAngle = 120;

// Serial protocol: one line per target, e.g. "X145,Y210\n"
int x = -1, y = -1;

void setup() {
  Serial.begin(115200);
  baseServo.attach(basePin);
  liftServo.attach(liftPin);

  // Initial position: rest
  baseServo.write(baseAngle);
  liftServo.write(liftAngle);
  Serial.println("Ready: waiting for object coordinates (X,Y)");
}

void loop() {
  if (Serial.available() > 0) {
    String input = Serial.readStringUntil('\n');
    input.trim();

    // Expect format: "X145,Y210"
    int yPos = input.indexOf(",Y");
    if (input.startsWith("X") && yPos > 0) {
      String xStr = input.substring(1, yPos);
      // Skip past ",Y": toInt() returns 0 if the text starts with a letter
      String yStr = input.substring(yPos + 2);
      x = xStr.toInt();
      y = yStr.toInt();

      // Safety clamp: assume camera region 0–640 (X), 0–480 (Y)
      x = constrain(x, 0, 640);
      y = constrain(y, 0, 480);

      // Map to servo angles (tune for your geometry)
      baseAngle = map(x, 0, 640, 10, 170);
      liftAngle = map(y, 0, 480, 160, 40);  // inverted Y for lift-up behavior
      baseServo.write(baseAngle);
      liftServo.write(liftAngle);

      // Feedback
      Serial.print("Moved: X");
      Serial.print(x);
      Serial.print(", Y");
      Serial.print(y);
      Serial.println(" - OK");
    }
  }
}
Adjust the map() ranges based on your arm’s physical limits.
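Before flashing new ranges, it can help to preview the pixel-to-angle mapping on the PC. The sketch below reimplements the classic Arduino map() integer formula in Python (an assumption about how your core implements it; some cores may differ slightly) and applies the same ranges as the firmware above. The names arduino_map and pixel_to_angles are hypothetical.

```python
def arduino_map(x, in_min, in_max, out_min, out_max):
    """Integer linear remap; division truncates toward zero as in C."""
    num = (x - in_min) * (out_max - out_min)
    den = in_max - in_min
    quotient = abs(num) // abs(den)
    if (num < 0) != (den < 0):
        quotient = -quotient
    return quotient + out_min


def pixel_to_angles(x, y):
    """Mirror the firmware: clamp to the frame, then map to servo angles."""
    x = min(max(x, 0), 640)
    y = min(max(y, 0), 480)
    base = arduino_map(x, 0, 640, 10, 170)
    lift = arduino_map(y, 0, 480, 160, 40)  # inverted Y, as in the sketch
    return base, lift


for point in [(0, 0), (320, 240), (640, 480)]:
    print(point, "->", pixel_to_angles(*point))
```

Printing a few corner points this way shows whether a proposed range would push a servo past its mechanical stop before the arm ever moves.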
Step 3: Build the Vision System (PC)
On your PC, run this lightweight OpenCV pipeline. It assumes a webcam is connected and the ESP32 appears as a serial port (e.g., /dev/ttyUSB0 on Linux, /dev/tty.usbserial-XXXX on macOS, or COM3 on Windows).
import cv2
import serial
import time

# --- CONFIG ---
SERIAL_PORT = '/dev/ttyUSB0'  # Windows: 'COM3', macOS: '/dev/tty.usbserial-XXXX'
BAUD_RATE = 115200
WEBCAM_INDEX = 0

# Color threshold (HSV) for the object; tune for your target!
# Example: red ball (lower, upper HSV ranges)
HSV_LOWER = (0, 70, 50)  # Adjust based on lighting & object color
HSV_UPPER = (10, 255, 255)

# Initialize serial
try:
    ser = serial.Serial(SERIAL_PORT, BAUD_RATE, timeout=1)
    time.sleep(2)  # Wait for ESP32 reset + boot
    print("Serial connected.")
except Exception as e:
    print(f"Serial error: {e}")
    exit()

# Open camera
cap = cv2.VideoCapture(WEBCAM_INDEX)
if not cap.isOpened():
    print("Error: Camera not found.")
    ser.close()
    exit()

print("Starting object detection...")
while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Resize for faster processing (optional: adjust)
    frame = cv2.resize(frame, (640, 480))

    # Convert to HSV for color detection
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    # Threshold to detect object
    mask = cv2.inRange(hsv, HSV_LOWER, HSV_UPPER)

    # Clean up noise
    mask = cv2.erode(mask, None, iterations=2)
    mask = cv2.dilate(mask, None, iterations=2)

    # Find contours
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    target = None
    if contours:
        # Get largest contour (assume it's the target)
        c = max(contours, key=cv2.contourArea)
        area = cv2.contourArea(c)

        # Minimum area filter (ignore noise)
        if area > 100:
            # Get minimum enclosing circle & center
            (x, y), radius = cv2.minEnclosingCircle(c)
            center = (int(x), int(y))
            radius = int(radius)

            # Draw circle and center point
            cv2.circle(frame, center, 3, (0, 255, 0), -1)  # Green dot
            cv2.circle(frame, center, radius, (0, 255, 0), 2)

            # Output coordinates (X, Y)
            target = center

    # Send data to ESP32 (format: "X145,Y210\n")
    if target:
        cmd = f"X{target[0]},Y{target[1]}\n"
        ser.write(cmd.encode())
        print(f"Sent: {cmd.strip()}")

    # Display
    cv2.imshow('Pick-and-Place Vision', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Cleanup
cap.release()
ser.close()
cv2.destroyAllWindows()
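Tune the cv2.inRange() bounds using real-time HSV sliders (not shown here for brevity). A well-tuned mask makes or breaks detection accuracy. Note that red wraps around the hue axis in OpenCV (H near 0 and near 179), so a single 0–10 hue range catches only part of it.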
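For readers who want a starting point, a slider-based tuner might look roughly like the sketch below. The window name, slider names, and the functions slider_bounds and run_tuner are all hypothetical; the starting values simply copy HSV_LOWER and HSV_UPPER from the script above. Only standard OpenCV HighGUI calls are used (createTrackbar, getTrackbarPos).

```python
WINDOW = "HSV tuner"  # hypothetical window name

# (slider name, maximum value, starting value); OpenCV hue spans 0-179
SLIDERS = [
    ("H low", 179, 0), ("S low", 255, 70), ("V low", 255, 50),
    ("H high", 179, 10), ("S high", 255, 255), ("V high", 255, 255),
]


def slider_bounds(h_lo, s_lo, v_lo, h_hi, s_hi, v_hi):
    """Order each channel pair so lower <= upper, as cv2.inRange expects."""
    pairs = [(h_lo, h_hi), (s_lo, s_hi), (v_lo, v_hi)]
    lower = tuple(min(p) for p in pairs)
    upper = tuple(max(p) for p in pairs)
    return lower, upper


def run_tuner(camera_index=0):
    """Show the live mask; press q to print the chosen bounds and quit."""
    import cv2  # imported lazily so slider_bounds() works without OpenCV

    cap = cv2.VideoCapture(camera_index)
    cv2.namedWindow(WINDOW)
    for name, maximum, start in SLIDERS:
        cv2.createTrackbar(name, WINDOW, start, maximum, lambda value: None)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        values = [cv2.getTrackbarPos(name, WINDOW) for name, _, _ in SLIDERS]
        lower, upper = slider_bounds(*values)
        cv2.imshow(WINDOW, cv2.inRange(hsv, lower, upper))
        if cv2.waitKey(1) & 0xFF == ord("q"):
            print(f"HSV_LOWER = {lower}")
            print(f"HSV_UPPER = {upper}")
            break

    cap.release()
    cv2.destroyAllWindows()
```

Call run_tuner() from a desktop session with the webcam attached, drag the sliders until only the target stays white in the mask, then copy the printed bounds into the main script.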
Testing & Calibration
- Connect the ESP32 and open your serial monitor. You should see: Ready: waiting for object coordinates (X,Y). Close the monitor before the next step, since only one program can hold the serial port at a time.
- Run the Python script. A camera window appears; hold up your object.
- Watch the green circle track the target. Check the terminal for lines like Sent: X145,Y210.
- Observe the arm. Does it move? If the range is too limited, adjust the servos in hardware (for example, by repositioning the horns).
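To check the arm without the camera in the loop, one option is to send a fixed sweep of coordinates. This is a minimal sketch: corner_points and send_sweep are hypothetical helpers, and the port argument is anything with a write() method, so the example below does a dry run into an in-memory buffer instead of real hardware.

```python
import io
import time


def corner_points(width=640, height=480):
    """Four frame corners plus the center, in the firmware's 0-640 / 0-480 range."""
    return [(0, 0), (width, 0), (width, height), (0, height),
            (width // 2, height // 2)]


def send_sweep(port, points, pause_s=1.0):
    """Write one "X..,Y..\\n" packet per point, pausing so the servos can settle."""
    sent = []
    for x, y in points:
        cmd = f"X{x},Y{y}\n"
        port.write(cmd.encode())
        sent.append(cmd.strip())
        time.sleep(pause_s)
    return sent


# Dry run: capture the bytes that would go down the wire
dry_run = io.BytesIO()
send_sweep(dry_run, corner_points(), pause_s=0)
print(dry_run.getvalue().decode(), end="")
```

With the ESP32 attached, pass an open pyserial port instead, for example serial.Serial('/dev/ttyUSB0', 115200, timeout=1), and watch whether each corner drives the arm to the expected extreme.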
Common Pitfalls
- Servo jitter? → Add a 0.1 µF ceramic capacitor across VCC and GND on each servo.
- ESP32 crashes on boot? → Ensure no high-current devices share power with the board.
- No serial data received? → Check port name (case-sensitive on Linux/macOS) and permissions (e.g., sudo usermod -aG dialout $USER).
Next Steps
- Add a gripper servo for true “pick” action.
- Log positions to learn motion paths and reduce drift.
- Deploy YOLOv5 instead of HSV for multi-class recognition.
Bringing It All Together
You now have a functional pick-and-place system: vision on the PC, motion on the microcontroller, and a simple line-based serial protocol in between. This architecture is extendable: swap the arm for a linear actuator, add a conveyor belt, or integrate a PLC for factory automation.
The real magic lies in the synergy — OpenCV’s reliability for perception, ESP32’s responsiveness for execution, and serial communication’s simplicity for connection. Start small, iterate fast, and let the robot learn from your hands.
Ready to build? Download the full code, hardware bill, and calibration checklist here.