Observer Chain - Theft Detection
Real-time shoplifting detection using Vision Transformers
30+ FPS
89% Accuracy
150ms Latency
Problem Statement
Retail stores lose billions annually to shoplifting, and traditional CCTV requires constant human monitoring, which is expensive and error-prone.
Overview
An intelligent surveillance system that combines object detection, pose estimation, and behavioral analysis to identify potential theft in retail environments. It uses Vision Transformers for unified detection and MediaPipe for keypoint extraction.
My Role & Contributions
Lead Developer - Architected the detection pipeline, integrated multiple vision models, and built the real-time streaming infrastructure.
Tech Stack
Challenges & Solutions
Challenge
Achieving 30 FPS multi-person tracking with Vision Transformers on edge hardware with 4GB RAM constraints
Solution
Optimized ViT inference with TensorRT INT8 quantization for a 3.2x speedup, and implemented frame skipping with optical flow interpolation to fill in the skipped frames
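The frame-skipping idea can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Box` type, the `detect` and `estimate_flow` callables, and the `detect_every` interval are all hypothetical names, and the flow estimate is reduced to a single (dx, dy) shift per box. A real system would likely compute flow with a vision library and run the quantized model inside `detect`.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple

Flow = Tuple[float, float]  # (dx, dy) motion in pixels between two frames


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box (hypothetical helper type)."""
    x: float
    y: float
    w: float
    h: float

    def shifted(self, flow: Flow) -> "Box":
        # Move the box by the estimated motion, keeping its size.
        dx, dy = flow
        return Box(self.x + dx, self.y + dy, self.w, self.h)


def track_with_frame_skipping(
    frames: Sequence[Any],
    detect: Callable[[Any], List[Box]],
    estimate_flow: Callable[[Any, Any, Box], Flow],
    detect_every: int = 3,  # assumed interval, not taken from the project
) -> List[List[Box]]:
    """Run the expensive detector on every Nth frame only.

    On skipped frames, boxes from the previous frame are propagated
    using the motion returned by `estimate_flow(prev, curr, box)`.
    """
    results: List[List[Box]] = []
    boxes: List[Box] = []
    prev = None
    for index, frame in enumerate(frames):
        if index % detect_every == 0:
            # Key frame: full detector pass resets any accumulated drift.
            boxes = detect(frame)
        else:
            # Skipped frame: shift each box by its estimated motion.
            boxes = [b.shifted(estimate_flow(prev, frame, b)) for b in boxes]
        results.append(boxes)
        prev = frame
    return results
```

With `detect_every=3`, the detector runs on one frame in three, so the per-frame cost is dominated by the cheaper flow estimate. The trade-off is drift between key frames, which grows with the interval.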
Challenge
Reducing false positive rate from 35% to <11% while maintaining 89% detection accuracy across diverse retail layouts
Solution
Designed multi-stage confidence scoring: YOLO person detection (0.7 confidence threshold), MediaPipe pose validation, and a behavior classifier over a 3s temporal window
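One way to picture the staged scoring is the sketch below. It is an assumption-laden illustration rather than the project's implementation: only the 0.7 person threshold and the 3s window come from the description above, while `FrameEvidence`, `StagedTheftScorer`, `alert_threshold`, `min_frames`, and the choice to average behavior scores over the window are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional, Tuple


@dataclass
class FrameEvidence:
    """Per-frame outputs of the three stages (hypothetical structure)."""
    timestamp: float       # seconds
    person_conf: float     # person detector confidence, 0..1
    pose_valid: bool       # did the pose keypoints pass validation?
    behavior_score: float  # per-frame suspicious-behavior score, 0..1


class StagedTheftScorer:
    def __init__(
        self,
        person_threshold: float = 0.7,  # from the description above
        window_seconds: float = 3.0,    # from the description above
        alert_threshold: float = 0.6,   # assumed value
        min_frames: int = 5,            # assumed value
    ) -> None:
        self.person_threshold = person_threshold
        self.window_seconds = window_seconds
        self.alert_threshold = alert_threshold
        self.min_frames = min_frames
        self._window: Deque[Tuple[float, float]] = deque()

    def update(self, ev: FrameEvidence) -> Optional[float]:
        """Return the windowed score if it warrants an alert, else None."""
        # Stage 1: discard low-confidence person detections.
        if ev.person_conf < self.person_threshold:
            return None
        # Stage 2: discard frames whose pose failed validation.
        if not ev.pose_valid:
            return None
        # Stage 3: aggregate behavior scores over the temporal window.
        self._window.append((ev.timestamp, ev.behavior_score))
        cutoff = ev.timestamp - self.window_seconds
        while self._window and self._window[0][0] < cutoff:
            self._window.popleft()
        if len(self._window) < self.min_frames:
            return None  # too little evidence; a single spike cannot alert
        mean = sum(score for _, score in self._window) / len(self._window)
        return mean if mean >= self.alert_threshold else None
```

Requiring several agreeing frames inside the window is what suppresses one-off misclassifications, at the cost of a short delay before an alert can fire.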
Challenge
Building horizontally scalable WebSocket infrastructure handling 50+ concurrent camera streams with <150ms latency
Solution
Architected event-driven system with RabbitMQ fanout exchanges, MongoDB change streams, and Redis pub/sub for real-time alert distribution
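The fanout pattern behind the alert distribution can be illustrated in-process with `asyncio` queues. This sketch deliberately stands in for the broker: it does not use RabbitMQ, MongoDB, or Redis, and `FanoutHub`, its method names, and the drop-oldest policy are assumptions made for illustration only.

```python
import asyncio
from typing import Dict


class FanoutHub:
    """In-process stand-in for a fanout exchange: every subscriber
    receives a copy of every published alert."""

    def __init__(self, max_queue: int = 100) -> None:
        self._queues: Dict[str, asyncio.Queue] = {}
        self._max_queue = max_queue

    def subscribe(self, name: str) -> asyncio.Queue:
        # Each subscriber gets its own bounded queue.
        queue: asyncio.Queue = asyncio.Queue(maxsize=self._max_queue)
        self._queues[name] = queue
        return queue

    def unsubscribe(self, name: str) -> None:
        self._queues.pop(name, None)

    def publish(self, alert: dict) -> int:
        """Deliver the alert to all subscribers; return how many got it."""
        delivered = 0
        for queue in self._queues.values():
            if queue.full():
                # Assumed policy: drop the oldest alert so a slow
                # consumer cannot delay delivery to the others.
                queue.get_nowait()
            queue.put_nowait(alert)
            delivered += 1
        return delivered


async def main() -> None:
    hub = FanoutHub()
    dashboard = hub.subscribe("dashboard")
    audit_log = hub.subscribe("audit-log")
    hub.publish({"camera": "cam-1", "event": "possible_theft"})
    print(await dashboard.get())
    print(await audit_log.get())


if __name__ == "__main__":
    asyncio.run(main())
```

Giving each consumer its own queue keeps publishers non-blocking, which matters when the latency budget is tight. In a multi-process deployment the same role would be played by the message broker.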