🍽️ Multiple Object Recognition (Plates)

Detecting repeated circular objects (plates) in images using gradient-based keypoints, clustering, and ellipse fitting.

Role

Algorithm design, porting, and deployment (solo project)

Timeline

2024 · Personal project

Tech

Python, OpenCV, DBSCAN, Gradio (originally C++ + andres-graph)


[Image: Plate detector preview with detected ellipses]

TL;DR

Detects multiple repeated circular objects (plates) in a single image using a purely geometric, model-based approach.
Uses gradient-based keypoints and similarity scores to identify points that align with a circular object model.
Groups keypoints with DBSCAN, approximating a multicut-style clustering without solving an explicit graph cut.
Fits rotated ellipses to each cluster to produce smooth plate boundaries, exposed via a Gradio demo on Hugging Face Spaces.

Problem & Context

In many practical settings we need to detect an unknown number of repeated objects in an image, for example plates on a table, coins on a surface, or circular components on a production line. Deep object detectors can solve this, but they require annotated training data and heavier infrastructure.

This project takes a different route: use geometric and gradient-based cues to detect multiple plate-like objects without training a neural network. The method is inspired by multicut optimization for instance segmentation but uses clustering instead of solving a full graph partitioning problem.

Data & Inputs

Input images that contain several plates or plate-like circular objects, typically from a top-down or slightly angled view.
No training data or labels are needed: the algorithm runs directly on the image and produces detections.
Preprocessing includes converting to grayscale, smoothing if necessary, and computing image gradients.
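
The preprocessing steps listed above can be sketched in a few lines. This is a minimal NumPy-only illustration, not the project's actual code: in an OpenCV pipeline, cv2.cvtColor, cv2.GaussianBlur, and cv2.Sobel would be the usual choices for the same steps. The function names, the box blur, and the rel_threshold parameter are assumptions made for this example.

```python
import numpy as np


def to_grayscale(image):
    """Convert an H x W x 3 RGB array to float grayscale using luma weights."""
    image = np.asarray(image, dtype=np.float64)
    if image.ndim == 2:  # already single-channel
        return image
    return image[..., :3] @ np.array([0.299, 0.587, 0.114])


def box_smooth(gray, radius=1):
    """Cheap box blur: mean over a (2r+1) x (2r+1) window with edge padding."""
    if radius <= 0:
        return gray
    size = 2 * radius + 1
    padded = np.pad(gray, radius, mode="edge")
    out = np.zeros_like(gray)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + gray.shape[0], dx:dx + gray.shape[1]]
    return out / size**2


def gradient_keypoints(image, radius=1, rel_threshold=0.5):
    """Return (points, directions) for pixels with a strong gradient.

    points     : N x 2 array of (x, y) pixel coordinates
    directions : N x 2 array of unit gradient vectors (pointing towards
                 brighter pixels)
    rel_threshold is a hypothetical tuning knob: keep pixels whose gradient
    magnitude exceeds this fraction of the image maximum.
    """
    gray = box_smooth(to_grayscale(image), radius)
    gy, gx = np.gradient(gray)  # derivatives along rows (y) and columns (x)
    magnitude = np.hypot(gx, gy)
    mask = magnitude > rel_threshold * magnitude.max()
    ys, xs = np.nonzero(mask)
    points = np.column_stack([xs, ys]).astype(np.float64)
    directions = np.column_stack([gx[mask], gy[mask]]) / magnitude[mask][:, None]
    return points, directions
```

On a synthetic bright disc, the returned points hug the disc boundary and the unit gradients point inward, which is the property the later similarity step relies on. A real photo would need a threshold tuned to its contrast.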

The main challenges are handling overlapping plates, perspective distortion, and cluttered backgrounds that may contain edges unrelated to the plates.

Approach & Algorithm

Instead of learning from data, the algorithm relies on a combination of gradient information, geometric constraints, and clustering. The pipeline is:

Gradient-based keypoint detection: compute gradient magnitude and direction; select keypoints on strong edges likely to belong to circular object boundaries.
Object-model similarity: for each keypoint, evaluate how well its gradient direction matches the expected direction for a circular boundary around candidate centers (via dot products or similar measures).
DBSCAN clustering (multicut approximation): represent keypoints in a feature space that captures position and similarity; apply DBSCAN to group keypoints that belong to the same object. This acts as a practical stand-in for solving a full multicut problem on a graph of keypoints.
Ellipse fitting: for each cluster, fit an ellipse (center, axes, rotation) that best approximates the grouped keypoints, giving a smooth outline for each plate.
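
The similarity and clustering steps can be illustrated with a small sketch. It assumes scikit-learn's DBSCAN, unit gradient vectors that point towards the plate interior, and a known approximate plate radius. The radius parameter and the centre-voting feature space are simplifications chosen for this example, not necessarily the feature space the project uses.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def direction_similarity(points, directions, center):
    """Dot product of each unit gradient with the unit vector towards `center`.

    Values near +1 mean the gradient points straight at the candidate centre,
    as expected for boundary points of a circle around that centre.
    """
    to_center = np.asarray(center, dtype=np.float64) - points
    norms = np.linalg.norm(to_center, axis=1, keepdims=True)
    return np.sum(directions * to_center / np.maximum(norms, 1e-9), axis=1)


def cluster_keypoints(points, directions, radius, eps=3.0, min_samples=5):
    """Group keypoints whose implied circle centres agree.

    `radius` is a hypothetical, user-supplied plate radius in pixels. Each
    keypoint votes for the centre at `point + radius * direction`; DBSCAN then
    groups nearby votes and labels isolated ones as noise (-1).
    """
    votes = points + radius * directions
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(votes)
    return labels, votes
```

Keypoints on the same circle collapse to nearly the same vote, so each plate appears as one dense cluster and the number of plates never has to be specified. Edges from background clutter tend to scatter their votes and end up labelled as noise.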

Results & Qualitative Behavior

Qualitatively, the detector finds the correct number and location of plates when edges are clear and plates are not heavily occluded. The elliptical fits align well with the true plate boundaries and remain stable under small perturbations.

Works well on clean images with distinct plate edges and moderate variation in scale and orientation.
Handles multiple plates without specifying the number of objects in advance.
Failure cases include heavily occluded plates, very low contrast edges, or backgrounds with strong circular patterns that mimic plate edges.
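
To make the ellipse-fitting step and its stability concrete, here is a minimal sketch of an algebraic least-squares conic fit in plain NumPy. It is a stand-in chosen to keep the example dependency-free. In an OpenCV pipeline, cv2.fitEllipse would be the usual call, and nothing here is claimed to match the project's implementation. The function name and return convention are assumptions for this example.

```python
import numpy as np


def fit_ellipse(points):
    """Fit a conic a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0 to 2-D points.

    Returns (center, semi_axes, angle): centre as (x, y), semi-axis lengths
    with the major axis first, and the major-axis angle in radians in [0, pi).
    Raises ValueError if the best-fitting conic is not an ellipse.
    """
    pts = np.asarray(points, dtype=np.float64)
    mean = pts.mean(axis=0)
    scale = pts.std() or 1.0
    x, y = ((pts - mean) / scale).T  # normalise for numerical conditioning

    design = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    # The coefficient vector is the right singular vector that belongs to the
    # smallest singular value of the design matrix.
    a, b, c, d, e, f = np.linalg.svd(design, full_matrices=False)[2][-1]

    quad = np.array([[a, b / 2.0], [b / 2.0, c]])
    center = np.linalg.solve(quad, -0.5 * np.array([d, e]))
    k = center @ quad @ center - f  # (p - center)^T quad (p - center) = k

    eigvals, eigvecs = np.linalg.eigh(quad)
    ratios = k / eigvals
    if np.any(ratios <= 0):
        raise ValueError("fitted conic is not an ellipse")

    semi_axes = np.sqrt(ratios) * scale
    order = np.argsort(semi_axes)[::-1]  # major axis first
    major_dir = eigvecs[:, order[0]]
    angle = np.arctan2(major_dir[1], major_dir[0]) % np.pi
    return center * scale + mean, semi_axes[order], angle
```

A simple way to probe stability is to fit the same synthetic ellipse twice, once with clean boundary points and once with a small amount of Gaussian noise added, and compare the recovered centre and axes. Partial arcs, as produced by occluded plates, constrain the fit far less, which is consistent with the failure cases listed above.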

The Gradio-based demo makes it easy to experiment: upload a new image and visually inspect which keypoints and ellipses are produced.

Implementation

Original approach implemented in C++ using OpenCV and andres-graph for graph-based reasoning.
Ported to Python to integrate easily with scientific libraries and web deployment tools.
Built a Gradio interface that allows users to upload images, run detection, and see the resulting ellipses overlaid on the image.
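
A Gradio wrapper of this kind can be quite small. The sketch below is an assumption about how such a demo might be wired, not the deployed app: detect_ellipses is a hypothetical callable that returns (center, semi_axes, angle) tuples, and the overlay is drawn with NumPy rather than OpenCV to keep the example self-contained.

```python
import numpy as np


def draw_ellipse(image, center, semi_axes, angle, color=(255, 0, 0), samples=720):
    """Return a copy of an H x W x 3 uint8 image with an ellipse outline drawn."""
    out = np.array(image, dtype=np.uint8, copy=True)
    t = np.linspace(0.0, 2.0 * np.pi, samples, endpoint=False)
    ex, ey = semi_axes[0] * np.cos(t), semi_axes[1] * np.sin(t)
    xs = np.rint(center[0] + np.cos(angle) * ex - np.sin(angle) * ey).astype(int)
    ys = np.rint(center[1] + np.sin(angle) * ex + np.cos(angle) * ey).astype(int)
    # Drop samples that fall outside the image before painting.
    inside = (xs >= 0) & (xs < out.shape[1]) & (ys >= 0) & (ys < out.shape[0])
    out[ys[inside], xs[inside]] = color
    return out


def build_demo(detect_ellipses):
    """Wrap a detector in a Gradio interface (upload image -> overlaid result).

    `detect_ellipses(image)` is a hypothetical function that yields
    (center, semi_axes, angle) tuples for each detected plate.
    """
    import gradio as gr  # imported lazily so this module loads without Gradio

    def predict(image):
        if image is None:  # nothing uploaded yet
            return None
        out = image
        for center, semi_axes, angle in detect_ellipses(image):
            out = draw_ellipse(out, center, semi_axes, angle)
        return out

    return gr.Interface(
        fn=predict,
        inputs=gr.Image(type="numpy", label="Input image"),
        outputs=gr.Image(type="numpy", label="Detected plates"),
        title="Plate detector",
    )
```

Calling build_demo(my_detector).launch() would start the local web UI. Keeping the drawing code separate from the interface makes it easy to test the overlay without starting a server.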

The porting required translating some of the graph-based logic into clustering-based logic better suited to Python’s ecosystem, without losing the core multicut-inspired idea of grouping consistent keypoints.

Challenges & Lessons Learned

DBSCAN parameters (epsilon, min samples) strongly influence the number and shape of detected clusters and required careful tuning.
Gradient-based keypoints can be noisy; preprocessing (smoothing, thresholding) has a big impact on stability.
Porting from C++ to Python is not just about syntax: data structures, performance trade-offs, and library choices need to be reconsidered to keep the implementation clean and usable.
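
One common aid for the DBSCAN tuning mentioned above is the sorted k-distance curve: for a chosen min samples, plot each point's distance to its k-th nearest neighbour and pick epsilon near the knee. The sketch below is a generic illustration of that heuristic, not a record of how the parameters were actually tuned here. The quantile shortcut in suggest_eps is an assumption standing in for reading the knee off a plot.

```python
import numpy as np


def k_distance_curve(features, k):
    """Sorted distance from every point to its k-th nearest neighbour.

    Uses a dense N x N distance matrix, so it is only suitable for a few
    thousand points at most.
    """
    x = np.asarray(features, dtype=np.float64)
    diffs = x[:, None, :] - x[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the point itself (distance 0),
    # so column k is the k-th nearest other point.
    kth = np.sort(dists, axis=1)[:, k]
    return np.sort(kth)


def suggest_eps(features, min_samples, quantile=0.9):
    """Crude epsilon suggestion from the k-distance curve.

    `quantile` is a hypothetical shortcut for locating the knee. k is set to
    min_samples - 1 on the assumption that the DBSCAN implementation counts
    the point itself towards min_samples, as scikit-learn's does.
    """
    curve = k_distance_curve(features, k=min_samples - 1)
    return float(np.quantile(curve, quantile))
```

Points inside dense clusters sit on the flat part of the curve and outliers on the steep tail, so an epsilon just past the flat part keeps clusters connected while leaving stray keypoints as noise.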

This project was a good exercise in geometric computer vision, unsupervised clustering, and turning a research-style algorithm into a practical tool with a friendly UI.

Links

Live Demo on Hugging Face Spaces · GitHub Repository · Back to all projects