Aurum Quanta Vision.
Object detection, inspection, and OCR for everything that isn't a document.
Cameras already running on your manufacturing line, retail floor, or logistics depot can do more than record. The same video feed can drive object detection, image classification, visual quality inspection, and OCR for everything that isn't a document.
Deployed at the edge or in your cloud account, depending on the latency you need. We evaluate on your actual production images. Vendor benchmarks tend to flatter the model, and lighting and camera angles in production will tell a different story almost every time.
From pixels to features.
A convolutional network learns visual features in layers: edges, then textures, then shapes, then objects. Each layer of filters compresses the image into something the next layer can reason about. Deep learning, done well, is a hierarchy of small, well-scoped transformations.
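To make the first rung of that hierarchy concrete, here is a minimal sketch of a single convolutional filter applied to a toy image. The Sobel kernel is hand-picked for illustration; a trained network learns its own filters, and the helper names (`conv2d`, `relu`) are ours, not part of any particular framework.

```python
import numpy as np


def conv2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Valid-mode 2D cross-correlation, the core operation of a conv layer."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum of the patch under the kernel.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x: np.ndarray) -> np.ndarray:
    """Keep positive responses, zero out the rest."""
    return np.maximum(x, 0.0)


# Hand-picked vertical-edge filter (Sobel), standing in for a learned one.
SOBEL_X = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

# Toy 6x6 image: dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# The feature map is non-zero only where windows straddle the boundary.
edges = relu(conv2d(image, SOBEL_X))
```

The resulting 4x4 feature map is zero everywhere except the two columns whose windows straddle the dark-to-bright boundary. A later layer would take maps like this one as its input.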
Concrete deliverables.
Object detection and classification
YOLO, Detectron, or custom architectures depending on your domain. Trained on labelled images from your actual operating environment, not on the staged photos vendors use for marketing.
Visual quality inspection
Defect detection on production lines, with tolerance thresholds tuned to your quality standards. We capture the lighting and angles you have, not the ones we'd prefer.
Edge or cloud deployment
Sub-100ms inference on edge devices (Jetson, Coral) or GPU-accelerated cloud, depending on which constraint binds first. We make the latency-and-cost tradeoff explicit so you can pick.
Domain-specific fine-tuning
Foundation models fine-tuned on your defects, your tolerances, and the specific quirks of your camera setup. Almost always cheaper than training from scratch and usually more accurate too.
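As a rough illustration of "tolerance thresholds tuned to your quality standards", the sketch below applies a different confidence threshold to each defect class. The class names, threshold values, and the `Detection` type are hypothetical placeholders; real values would come out of the dataset audit.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Detection:
    defect_class: str
    score: float  # model confidence in [0, 1]


# Hypothetical per-class thresholds: strict where false alarms are costly,
# looser where a missed defect is the bigger risk.
THRESHOLDS: Dict[str, float] = {"scratch": 0.80, "dent": 0.60, "missing_label": 0.50}
DEFAULT_THRESHOLD = 0.70  # used for any class not listed above


def rejects(detections: List[Detection]) -> List[Detection]:
    """Return the detections whose score meets the threshold for their class."""
    return [
        d for d in detections
        if d.score >= THRESHOLDS.get(d.defect_class, DEFAULT_THRESHOLD)
    ]


def part_passes(detections: List[Detection]) -> bool:
    """A part passes inspection only if no detection clears its threshold."""
    return not rejects(detections)
```

Keeping the thresholds in a plain mapping means quality engineers can adjust one class without retraining or redeploying the model.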
Out-of-distribution images are escalated, not classified.
# vision/inspect.py: refuse to score images the model wasn't trained for
import numpy as np

def inspect(image: np.ndarray, ood_threshold: float = 0.15) -> Result:
    # Embed the image, then measure how far it sits from the training embeddings.
    embedding = encoder(image)
    ood_score = mahalanobis(embedding, reference_distribution)
    if ood_score > ood_threshold:
        # Too far from anything seen in training: escalate to a human instead of scoring.
        return Result(verdict="out_of_distribution", queue_for_review=True)
    return Result(verdict=classifier(embedding), queue_for_review=False)

Mahalanobis distance against the training distribution. New camera angles and new lighting get flagged before they get scored.
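The snippet above calls a `mahalanobis` helper without showing it. One possible shape for that helper, using only NumPy, is sketched below. Splitting the reference distribution into a mean and an inverse covariance, the `fit_reference` name, and the `eps` regularisation term are all assumptions made for illustration. The right threshold depends on the embedding dimension and has to be calibrated on held-out images.

```python
import numpy as np


def fit_reference(embeddings: np.ndarray, eps: float = 1e-6):
    """Estimate the mean and inverse covariance of training embeddings.

    embeddings has shape (n_samples, dim) with dim >= 2.
    """
    mean = embeddings.mean(axis=0)
    cov = np.cov(embeddings, rowvar=False)
    # Small diagonal term keeps the covariance invertible (assumed value).
    cov = cov + eps * np.eye(cov.shape[0])
    return mean, np.linalg.inv(cov)


def mahalanobis(embedding: np.ndarray, mean: np.ndarray, cov_inv: np.ndarray) -> float:
    """Distance from the training distribution, scaled by its covariance."""
    delta = embedding - mean
    return float(np.sqrt(delta @ cov_inv @ delta))


# Synthetic stand-in for real training embeddings.
rng = np.random.default_rng(0)
train = rng.normal(size=(500, 4))
mean, cov_inv = fit_reference(train)

in_dist = mahalanobis(mean, mean, cov_inv)          # at the centre of the data
shifted = mahalanobis(mean + 10.0, mean, cov_inv)   # far outside the data
```

An embedding at the centre of the training data scores zero, while one shifted well outside it scores high and would be queued for review.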
How it would unfold.
Dataset audit
Image sample review, labelling strategy, success metric (precision, recall, mAP) agreed per defect class.
Pilot
Model trained on labelled set, evaluated on held-out images, tuned to your precision/recall tradeoff.
Deployment
Edge or cloud inference, monitoring, retraining pipeline, integration with your operational system.
Ongoing
Drift monitoring and periodic retraining as operating conditions change.
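The audit and pilot steps both turn on precision and recall, so here is a minimal sketch of how a score threshold might be chosen for one defect class on held-out data. The function names and the toy labels and scores are illustrative assumptions, not output from a real model.

```python
import numpy as np


def precision_recall(y_true: np.ndarray, scores: np.ndarray, threshold: float):
    """Precision and recall when everything scoring >= threshold is flagged."""
    pred = scores >= threshold
    tp = int(np.sum(pred & (y_true == 1)))
    fp = int(np.sum(pred & (y_true == 0)))
    fn = int(np.sum(~pred & (y_true == 1)))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def pick_threshold(y_true: np.ndarray, scores: np.ndarray, min_precision: float):
    """Lowest threshold that meets the precision target.

    Recall never increases as the threshold rises, so the lowest qualifying
    threshold is also the one with the highest recall. Returns None if no
    threshold reaches the target.
    """
    for t in np.unique(scores):  # np.unique returns values in ascending order
        p, r = precision_recall(y_true, scores, float(t))
        if p >= min_precision:
            return float(t), p, r
    return None


# Toy held-out set for a single defect class (1 = defect present).
y_true = np.array([0, 0, 1, 0, 1, 1])
scores = np.array([0.1, 0.3, 0.4, 0.6, 0.7, 0.9])

strict = pick_threshold(y_true, scores, min_precision=0.9)
relaxed = pick_threshold(y_true, scores, min_precision=0.7)
```

On this toy data the stricter precision target pushes the threshold up and gives away recall, which is the tradeoff to agree per defect class before the pilot starts.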
Common questions.
How many labelled images do we need?
Usually somewhere between 500 and 2,000 to get a useful baseline with transfer learning. Fewer if the defects are visually distinctive, more if they're subtle. We'll commit to a number after seeing the images, not before.
Can it run on a camera at the edge?
Yes. Models compile for Jetson, Coral, or most modern mobile chips. We benchmark latency on the actual hardware before committing to an architecture, because the spec sheet rarely matches the production number.
What if our defects are subtle?
We start by measuring the baseline model on your data and then fine-tune from there. Some defect categories are hard for any vision model to pick up at all, especially when the camera setup or lighting works against the signal. If a class isn't going to clear the target, we'd surface that in week 3 rather than discover it in month 4.
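On the edge question above, benchmarking latency on the actual hardware can be as simple as timing repeated calls and reading off the tail, not just the average. The sketch below is a generic harness under stated assumptions: `fn` stands in for one inference call, and the warmup and run counts are arbitrary defaults.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], warmup: int = 10, runs: int = 100) -> Dict[str, float]:
    """Time repeated calls to fn and report median, p95, and max in milliseconds."""
    # Warm-up calls absorb one-off costs such as lazy initialisation and caches.
    for _ in range(warmup):
        fn()

    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    samples_ms.sort()
    # Nearest-rank style index for the 95th percentile, clamped to the last sample.
    p95_index = min(len(samples_ms) - 1, int(0.95 * len(samples_ms)))
    return {
        "median_ms": statistics.median(samples_ms),
        "p95_ms": samples_ms[p95_index],
        "max_ms": samples_ms[-1],
    }


# Stand-in workload; on a real device this would wrap a single inference call.
stats = benchmark(lambda: sum(range(1000)), warmup=2, runs=20)
```

The p95 and max figures matter more than the median for a line that cannot stall, which is why the numbers need to come from the target device and not from a spec sheet.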
Let's build it.
A 30-minute discovery call. We'll tell you whether we're the right shop for this.
Book a discovery call →