Accessible Document AI
Making documents accessible to everyone through AI-powered processing
Pages/min: 45+
Accuracy: 94%
Latency: <2s
Building production AI systems that solve real problems, from driver safety monitoring to intelligent document processing.
A snapshot of who I am and what I do
I'm an AI & Backend Engineer who builds production-grade infrastructure that brings AI to life. My focus is on transforming cutting-edge models into reliable, high-performance applications that solve real-world problems. From real-time computer vision to privacy-first LLM integrations, I architect solutions that are both intelligent and scalable.
My expertise covers RESTful APIs, WebSocket servers for ML model serving, async processing pipelines with RabbitMQ, and inference optimization with TensorFlow Lite and TensorRT. I've architected systems processing video at 30 FPS for drowsiness detection, RAG pipelines handling 45+ pages/min with LangChain, and offline voice assistants achieving <5s response with local LLMs. Whether it's FastAPI, PostgreSQL/MongoDB, or Docker on AWS, I focus on performance and scalability.
I build offline-capable, privacy-respecting systems optimized for edge deployment. I don't just integrate models – I architect the infrastructure around them, ensuring reliability under real-world conditions. My work bridges AI research and production engineering, creating backend systems that are robust, efficient, and ready for scale.

Location: Ahmedabad, India
Experience: 3+ Years
Focus: AI & Backend
Deep technical case studies showcasing architecture, challenges, and solutions
Real-time shoplifting detection using Vision Transformers
FPS: 30+
Accuracy: 89%
Latency: 150ms
Real-time drowsiness detection using computer vision and AI
Warning Time: 5-10s
False Positives: <5%
FPS: 30+
Offline-capable camera streaming with mobile access
Resolution: 1080p
Latency: <300ms
Range: 50m
Production-ready RAG system for enterprise documents
Docs: 100K+
Accuracy: 91%
Response: <3s
Privacy-first voice assistant with complete offline operation
Stars: 3
Response: <2s
Privacy: 100%
Natural language to SQL/MongoDB queries with RAG
Databases: 2
Accuracy: 87%
Stars: 1
Self-hosted ChatGPT with Ollama and multi-model support
Response: <3s
Privacy: 100%
Stars: 1
Enterprise-grade microservices backend with event-driven architecture
Uptime: 99.9%
Services: 8+
Throughput: 10K TPS
Technical skills and tools I use to build AI and backend systems
Backend services, ML/vision prototypes, FastAPI, data processing pipelines
Backend microservices, NestJS for API services
Node.js services, backend development
High-performance Python APIs with async support, automatic OpenAPI docs
JavaScript runtime for scalable backend services, event-driven architecture
Service-oriented APIs, dependency injection, scalable architecture
Messaging between services, event-driven patterns
Event streaming for data ingestion pipelines, real-time processing
S3 for video storage, boto3 SDK, cloud infrastructure
Azure AI Document Intelligence, Cosmos DB, Azure OpenAI Services
Containerization, multi-stage builds, image management
Vision OCR, Gemini API integration for document processing
Document store for embeddings, application data, NoSQL queries
Relational storage, complex queries, schema introspection
Vector search for RAG systems, globally distributed NoSQL
Full-text search, analytics, log aggregation
Custom model training, TF Lite optimization for edge deployment
Real-time video processing, facial landmark detection, computer vision pipelines
Face Mesh, Blendshapes, facial landmark tracking for drowsiness detection
ViT-based models for unified detection and pose estimation
Object detection, real-time inference optimization
Google Vision OCR, Azure Document Intelligence, text extraction
Ollama, LangChain, local and cloud LLMs, prompt engineering
Vector embeddings, semantic search, retrieval-augmented generation
Whisper, faster-whisper, CPU-optimized STT for offline processing
Coqui TTS, pyttsx3, multi-engine TTS for voice assistants
Camera streaming from Raspberry Pi, mobile access via hotspot
Real-time communication for monitoring and control
Offline streaming setups, camera integration
Have a project in mind, want to collaborate, or just say hello? I'd love to hear from you.
I'm currently open to new opportunities and interesting projects. Whether you're looking for a full-time developer or need help with a specific AI/backend challenge, let's chat!
Location: India