Problem Statement
The ESP32-CAM is good at capturing images but awkward to use as a complete device: it is resource-constrained, and serving a rich UI from the same board is fiddly. A clean pattern is to split responsibilities: one board owns the user-facing web interface and orchestration, another owns the camera. The challenge is coordinating the two boards reliably over the local network.
Proposed Solution
Two coordinated ESP32 sketches form a simple remote camera. The controller ESP32 hosts a web page with a "Capture Photo" button. On click, it requests /capture from the ESP32-CAM, which initializes the camera, takes a photo, and returns it as a JPEG for the controller to display. Both boards connect to the same WiFi network and talk over HTTP.
Full Solution Details
- Controller (esp32): connects to WiFi and runs a web server with a friendly UI; on capture, makes an HTTP request to the camera board and displays the returned image.
- Camera (esp32-cam): connects to WiFi, initializes the camera, and exposes /capture (returns a JPEG) and /status (diagnostics/readiness) endpoints.
- Setup flow: flash the cam (note its IP from serial), set that IP in the controller sketch, flash the controller, open the controller's IP in a browser, then capture and view.
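The "set that IP in the controller sketch" step can be kept to a single constant plus a small URL helper. The following is a minimal sketch, not the project's actual code: the name kCameraIp, its placeholder address, and the cameraUrl helper are all assumptions for illustration. It uses only the standard library, so it can be compiled and checked off-device.

```cpp
#include <cassert>
#include <string>

// Hypothetical placeholder: replace with the IP the ESP32-CAM prints on serial.
const std::string kCameraIp = "192.168.1.50";

// Build the full URL for one of the camera's endpoints ("/capture" or "/status").
std::string cameraUrl(const std::string& ip, const std::string& path) {
    assert(!path.empty() && path[0] == '/');  // endpoint paths start with '/'
    return "http://" + ip + path;
}
```

Keeping the address in one place means that re-flashing the controller after the camera's IP changes touches a single line.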
Technical Documentation
C++ on the Arduino/ESP32 toolchain, two sketches. Each board runs its own embedded HTTP server; the controller acts as an HTTP client to the camera's server (a clean client/server split across two microcontrollers on one LAN). The ESP32-CAM handles camera initialization and frame capture and serves the JPEG bytes directly over HTTP; the controller handles presentation and orchestration. Readiness/status endpoints make the coordination robust (the controller can check the cam is up before requesting a frame).
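To make "serves the JPEG bytes directly over HTTP" concrete, here is a rough sketch of what travels on the wire and how the controller could sanity-check what it receives. The function names are hypothetical, and in an Arduino sketch the web server library would normally write the headers itself. The check relies on the standard JPEG markers: a complete file starts with FF D8 and ends with FF D9.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Header block the camera side would send ahead of the raw JPEG bytes.
std::string jpegResponseHeader(std::size_t bodyLength) {
    assert(bodyLength > 0);  // a zero-length frame means the capture failed
    std::string header = "HTTP/1.1 200 OK\r\n";
    header += "Content-Type: image/jpeg\r\n";
    header += "Content-Length: " + std::to_string(bodyLength) + "\r\n";
    header += "Connection: close\r\n\r\n";
    return header;
}

// Cheap controller-side check: a complete JPEG starts with the SOI marker
// (FF D8) and ends with the EOI marker (FF D9).
bool looksLikeJpeg(const std::vector<std::uint8_t>& data) {
    const std::size_t n = data.size();
    if (n < 4) return false;
    return data[0] == 0xFF && data[1] == 0xD8 &&
           data[n - 2] == 0xFF && data[n - 1] == 0xD9;
}
```

A check like this catches truncated transfers before the controller tries to display a broken image.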
Tech Stack
C++, ESP32 (controller), ESP32-CAM (capture), Arduino framework, embedded HTTP servers, WiFi/HTTP on LAN.
System Design
Browser ──► Controller ESP32 (web UI server)
                 │  click "Capture"
                 ▼  HTTP GET /capture
            ESP32-CAM (HTTP server)
                 │  init camera → capture → JPEG
                 ▼
            JPEG bytes ──► Controller displays image
(also GET /status for readiness/diagnostics) · same WiFi LAN
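The flow in the diagram can be sketched as a single function on the controller side. This is an illustration under stated assumptions, not the sketch's real code: HttpResult, HttpGet, and captureViaCamera are hypothetical names, and treating a 200 response from /status as "ready" is an assumption. The transport is passed in as a callable, so on the board it would wrap the HTTP client, while on a desktop it can be a fake.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Result of one HTTP GET: status code plus raw body bytes.
struct HttpResult {
    int statusCode;
    std::string body;
};

// Injected transport (assumption): on the controller this would wrap the
// board's HTTP client; in tests it can be a plain lambda.
using HttpGet = std::function<HttpResult(const std::string& url)>;

// Ask /status first and only request /capture when the camera answers 200.
// Returns the JPEG bytes, or an empty string if either step fails.
std::string captureViaCamera(const std::string& baseUrl, const HttpGet& get) {
    assert(static_cast<bool>(get));  // a transport must be supplied
    const HttpResult status = get(baseUrl + "/status");
    if (status.statusCode != 200) return "";
    const HttpResult frame = get(baseUrl + "/capture");
    if (frame.statusCode != 200) return "";
    return frame.body;
}
```

Because the camera is never asked for a frame when the status check fails, a camera that is still booting costs one cheap request instead of a failed capture.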
Smart Architectural Decisions
- Separation of concerns across two boards. Letting the constrained ESP32-CAM do only capture, while a second board owns the UI and orchestration, is the right division of labor — each board does what it's good at, over a clean HTTP boundary.
- HTTP as the inter-board contract. Using plain HTTP (capture/status endpoints) for board-to-board communication keeps the coupling simple, debuggable from a browser, and language-agnostic.
- Status/readiness endpoint. A /status check makes the two-board handshake robust rather than fire-and-forget: the controller can confirm the camera is ready before requesting a frame.
- Browser as the client. No custom app is needed: any device on the LAN can use it.
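One way the readiness check could be used in practice is a bounded polling loop before the first capture. The sketch below is an assumption about how that might look, not the project's implementation: waitUntilReady, probe, and pause are hypothetical names. On the board, probe would issue GET /status and pause would call the platform's delay function.

```cpp
#include <cassert>
#include <functional>

// Poll a readiness probe up to maxAttempts times, pausing between tries.
// Both callables are injected so the loop runs the same on or off the board.
bool waitUntilReady(const std::function<bool()>& probe,
                    const std::function<void()>& pause,
                    int maxAttempts) {
    assert(maxAttempts > 0);
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        if (probe()) return true;
        if (attempt < maxAttempts) pause();  // no pause after the final try
    }
    return false;
}
```

Bounding the attempts keeps the controller's web UI responsive: if the camera never comes up, the page can report an error instead of hanging.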
Impacts
A tidy, working IoT example of multi-device coordination and embedded web serving — a clean reference for splitting capture vs. control across ESP32 boards over the local network.
Demonstrated Skills
Embedded C++/ESP32 firmware; ESP32-CAM camera init + JPEG capture; embedded HTTP servers and HTTP client/server coordination between devices; WiFi/LAN networking; pragmatic separation of concerns on constrained hardware.