Rohit Kumar

AI/ML Engineer with 2+ years of hands-on experience contributing to production-grade AI systems, RAG pipelines, and intelligent automation solutions. I have worked on designing, developing, and deploying ML solutions using transformer-based models, with a strong focus on reliable MLOps practices.

10+ AI Projects
2+ Years Experience
8+ Technologies

About Me

Building intelligent systems that solve real-world problems

🎓 B.Tech in Computer Science & Engineering
Swami Vivekanand Institute of Engineering & Technology • CGPA: 8.5/10
📍 Banur, Punjab, India

Professional Experience

Building AI solutions at scale for leading organizations

Associate AI/ML Engineer

EMINENCE INTERNET TECHNOLOGY PVT. LTD.
April 2024 — Present | Mohali, India
LangChain LangGraph PyTorch RAG NLP FastAPI Vector DBs MLOps

Software Developer

Shine Dezign Pvt Ltd
6 months (Internship)

Developed and maintained PHP-based web applications with an emphasis on backend performance, clean architecture, and reliable API integrations. Worked on responsive UI integration and optimized server-side logic to improve application efficiency and user experience.

Featured Projects

Production-grade AI solutions across multiple domains

📱 Product Image Classification

Built an end-to-end deep learning pipeline that classifies product images as Mobile Phones or Laptops using TensorFlow and MobileNetV2. Implements a complete ML workflow covering dataset preparation, training, evaluation, and deployment-ready inference via FastAPI. Features binary classification with transfer learning, achieving high accuracy on custom datasets for e-commerce automation.

TensorFlow MobileNetV2 FastAPI Transfer Learning Image Classification E-commerce AI

🎵 Deepfake Audio Detection Model

Developed a machine learning system to detect deepfake/synthetic audio using Wav2Vec2 embeddings and classical ML classifiers. Achieved 92.86% accuracy with Logistic Regression on the Real vs Fake Human Voice dataset (70k samples). The pipeline extracts 768-dimensional feature vectors, handles variable-length audio, and standardizes features with StandardScaler. Trained and compared Logistic Regression (best), SVM, and Random Forest models.

Wav2Vec2 Audio ML Logistic Regression SVM Deepfake Detection Security
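A minimal sketch of how variable-length audio can be reduced to fixed-size feature vectors and then standardized. It assumes mean pooling over time, which the description above does not confirm, and it uses tiny 2-dimensional frames in place of real 768-dimensional Wav2Vec2 embeddings. The helper names `mean_pool` and `standardize` are hypothetical. The scaling mirrors the usual StandardScaler behavior (population standard deviation, zero-variance columns left unscaled).

```python
from statistics import mean, pstdev
from typing import List, Sequence


def mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average frame-level embeddings over time into one fixed-size vector.

    Assumption: mean pooling is used; the original pipeline may differ.
    """
    return [mean(column) for column in zip(*frames)]


def standardize(vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale each feature column to zero mean and unit variance."""
    columns = list(zip(*vectors))
    means = [mean(c) for c in columns]
    # Use 1.0 as the scale for constant columns to avoid division by zero.
    scales = [pstdev(c) or 1.0 for c in columns]
    return [
        [(x - m) / s for x, m, s in zip(vector, means, scales)]
        for vector in vectors
    ]


# Two clips of different lengths (2 frames and 3 frames), toy 2-dim frames.
clip_a = [[1.0, 2.0], [3.0, 4.0]]
clip_b = [[4.0, 3.0], [4.0, 3.0], [4.0, 3.0]]

pooled = [mean_pool(clip_a), mean_pool(clip_b)]
features = standardize(pooled)
```

Both clips end up as vectors of the same length regardless of duration, so they can be passed to a classical classifier such as Logistic Regression.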

🤖 Multi-Agent RAG System with LangGraph

Architected an enterprise-grade multi-agent RAG system using LangGraph orchestration. Implements specialized agents for document retrieval, synthesis, fact-checking, and response generation. Features dynamic routing, agent collaboration, and context-aware memory management, and reaches 95%+ answer accuracy on domain-specific queries.

LangGraph Multi-Agent RAG Pinecone FastAPI Redis

🎯 Fine-Tuned Sentiment Analysis Models

Fine-tuned BERT and T5 transformer models for domain-specific sentiment analysis. Implemented transfer learning, data augmentation, and advanced preprocessing. Achieved a 94% F1-score on a custom dataset with balanced precision and recall. Deployed with FastAPI for real-time inference.

BERT T5 Transfer Learning HuggingFace PyTorch FastAPI

💻 LLaMA Command Intent Classifier

Fine-tuned LLaMA 3.1 8B to classify Linux commands and natural-language input into predefined intents. Built for AI terminal assistants and DevOps automation. Achieved 96% accuracy using LoRA fine-tuning with a custom prompt-completion dataset.

LLaMA 3.1 LoRA Transformers NLP Classification

🎙️ AI-Powered Podcast Intelligence Platform

Built a production-grade podcast processing system with speaker diarization, multi-host/guest identification, and automatic music filtering. Implemented real-time line-level editing, WebSocket-based progress tracking, and chunked long-form processing using a sliding-window approach for LLM limits. Integrated LLM-driven summarization, sentiment analysis with timestamps, and semantic search via Pinecone, with transcripts securely stored in AWS S3.

AssemblyAI ElevenLabs AWS S3 WebSocket LangChain Pinecone Speaker Diarization Music Detection Real-time Editing Chunked Processing Background Jobs NLP
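The sliding-window chunking mentioned above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production code: it splits on words rather than model tokens, and the function name `sliding_window_chunks` and its `window` and `overlap` parameters are hypothetical.

```python
from typing import List


def sliding_window_chunks(words: List[str], window: int, overlap: int) -> List[List[str]]:
    """Split a word list into overlapping windows of at most `window` words.

    Consecutive windows share `overlap` words so that context is carried
    across chunk boundaries when each chunk is sent to an LLM separately.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")

    step = window - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + window])
        if start + window >= len(words):
            break  # this window already reaches the end of the transcript
    return chunks


# Toy transcript of ten "words"; real input would be a long transcript.
transcript = "a b c d e f g h i j".split()
chunks = sliding_window_chunks(transcript, window=4, overlap=1)
```

In practice the window size would be chosen from the model's context limit, and the overlap trades extra processing for continuity between neighboring chunks.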

📩 AI-Driven Customer Support Automation

Built an intelligent complaint-handling system using the CrewAI multi-agent framework. It automatically processes text and audio inputs, classifies issues, verifies them against policies, generates responses, and integrates with the CRM. Reduced response time by 70% while maintaining quality.

CrewAI OpenAI MySQL CRM Automation Multi-Agent

🤖 n8n Multi-Agent Workflow Orchestration

Engineered a sophisticated n8n workflow with dual AI agents, persistent MongoDB memory, and intelligent routing. Implements context-aware conversations, webhook triggers, and modular architecture for scalable automation across multiple domains and use cases.

n8n OpenAI MongoDB Webhook Automation

🤖 IoT Device Control via ADB & MQTT

Developed a remote control system for Hisense TV and Fire TV using ADB and MQTT protocols. Enables seamless device communication, Android automation, and real-time command execution for smart home integration.

ADB MQTT IoT Python

🕷️ ML-Powered Intelligent Web Scraper

Built an ML-powered web scraping system that handles dynamic popups, CAPTCHA detection, and anti-bot measures. Trained a CNN-based popup detection model, achieving 98% accuracy and enabling automated interaction and seamless scraping across 50+ websites with Selenium and headless Chrome.

Selenium CNN Computer Vision Web Scraping BeautifulSoup Automation Anti-Bot

🎤 Ultra-Low Latency Voice AI (~1.5s)

Built a high-performance speech-to-speech system with response latency of roughly 1.5 seconds. Uses Deepgram for STT, a Groq-hosted LLM for rapid inference, and ElevenLabs for natural TTS. Implements streaming responses, ChromaDB for knowledge retrieval, and an optimized pipeline that achieves consistent sub-2-second response times.

Deepgram STT Groq LLM ElevenLabs TTS ChromaDB Streaming Low Latency

💬 Enterprise Conversational AI Platform

Developed a production-ready chatbot with LangChain and ChromaDB. Features voice interaction, real-time streaming via WebSocket, conversation memory, and automatic HTML transcript generation sent via SMTP. Handles context across sessions with personalized responses.

LangChain ChromaDB OpenAI TTS/STT WebSocket SMTP

🎥 InstantMeet - WebRTC Video Platform

Built a production-grade video conferencing app with FastAPI and WebRTC. Features instant meeting creation, multi-participant support, text chat, user authentication, OTP recovery, and optional recordings. Optimized for low latency and high concurrent user capacity.

FastAPI WebRTC WebSocket SQLite Real-time

Technical Expertise

Comprehensive skill set spanning AI/ML, automation, scraping, and backend development

🤖 AI/ML & Deep Learning

PyTorch TensorFlow Transformers HuggingFace LangChain LangGraph CrewAI AutoGen NLP Computer Vision RAG Fine-tuning

🔧 Automation & Workflow

n8n Langflow Zapier Zoho Process Automation Workflow Orchestration

🕷️ Web Scraping & Data Extraction

Selenium BeautifulSoup Scrapy Playwright Requests Anti-Bot Handling

⚙️ Backend & APIs

FastAPI Django Flask Python REST APIs WebSocket

💾 Databases & Vector Stores

PostgreSQL MySQL MongoDB SQLite ChromaDB Pinecone Redis FAISS

🚀 DevOps & Cloud

Docker Git/GitHub Linux CI/CD AWS Basic MLOps

Let's Connect

Open to exciting opportunities and collaborations

📱 Phone: +91 7783805286, +91 9576243008
💼 LinkedIn: View Profile
💻 GitHub: View Repositories
🤗 Hugging Face: View Models