Career
Professional Experience
Data Scientist
ExtraHop Networks · Seattle, WA
Apr 2024 - Present
As a Data Scientist on ExtraHop's research team, I build and productionize machine-learning systems that power enterprise-scale threat detection and network intelligence.
- Productionized a streaming SQL injection detector built on Naive Bayes, with Firehose telemetry for drift monitoring; containerized and deployed it via ExtraHop's async stack, reducing false positives by 53% while maintaining sub-100 ms latency.
- Developed a multi-agent RAG system (LangGraph + OpenAI GPT-4o + ChromaDB + BGE embeddings) that parsed 2K+ merge requests to surface tribal knowledge and internal conventions, enabling context-aware code reviews and accelerating development.
- Deployed a ReAct agent (LangGraph + DSPy + LiteLLM) with web scraping and threat intel tools to generate threat briefings for CVEs, reducing analyst response time from hours to minutes.
- Optimized GPT-4.1 nano for threat briefing generation using DSPy's GEPA algorithm with LLM-as-Judge evaluation, improving task performance from 71% to 93%. Tracked experiments and traced pipeline runs with MLflow.
Decision Analytics Consultant
Investor Group Services (IGS) · Chicago, IL
Jan 2024 - Mar 2024
Drove data infrastructure modernization and analytics automation for due diligence and client case management.
- Spearheaded migration of the entire legacy database to Microsoft Azure Fabric Environment, leveraging SharePoint Lists for data integration. Developed a real-time client case management tool using Power BI dashboards, significantly improving query response times.
- Engineered automated scripts for generating due diligence reports and created a geospatial dashboard integrating ArcGIS with Power BI, providing strategic acquisition insights for clients.
- Optimized SQL queries to significantly improve the efficiency of the client's data warehousing operations.
Data Science Co-op
NCR Corporation · Atlanta, GA
Aug 2023 - Dec 2023
Built a GenAI assistant for enterprise support automation, enabling self-service ticket resolution for 5K+ field technicians.
- Implemented the assistant with Azure OpenAI and LangChain to resolve technician tickets, unlocking self-service for 5K+ field agents.
- Curated FAISS + ChromaDB vector stores with ada-002 embeddings to deliver high-recall retrieval grounded in NCR's knowledge base.
Data Science Intern
Cognira · Atlanta, GA
May 2023 - Aug 2023
Built PySpark pipelines integrating fragmented retail data into comprehensive promotional datasets, driving inventory optimization.
- Designed PySpark pipelines that integrate fragmented retail data into comprehensive promotional datasets.
- Predicted post-promo demand drops using regression models, driving $485K in inventory optimization for a flagship client.
Machine Learning Engineer
Reliance Jio AI-COE · India
Aug 2020 - Jul 2022
Led AI initiatives for one of the world's largest telecom networks to solve real-time customer quality and performance issues.
- Optimized ETL pipelines using PySpark & Apache Airflow, reducing batch job runtime from 13h to 3.3h across 400M+ users.
- Trained an end-to-end Digital Twin for regional internet performance, leveraging LightGBM/XGBoost with advanced feature engineering and tuning, boosting AUC from 0.61 to 0.82 for download speed prediction.
- Used SHAP explainability to drive root cause analysis of 4G/5G failures, powering proactive service interventions.
- Integrated root cause codes into network optimization and marketing use cases.
Research and Development Intern
Siemens Technology India · India
Dec 2019 - Jun 2020
Researched behavioral models for autonomous navigation using Reinforcement Learning and Imitation Learning.
- Researched multiple behavioral models for autonomous navigation using Reinforcement Learning (RL) and Imitation Learning (IL).
- Finalized the 'Learning by Cheating' method, implementing it with TensorFlow, OpenAI Gym, and CARLA.
- Devised a pipeline leveraging Mask R-CNN and IoU tracking to transform raw traffic camera feeds into distinct vehicle trajectories, doubling the size of the trajectory dataset.
Summer Technology Analyst
Morgan Stanley · Bengaluru, India
May 2019 - Jul 2019
Developed big data tooling for querying historical Apache Kafka logs, enabling faster validation of business-critical operations.
- Developed and deployed a Java-based tool on in-house cloud instances for querying past data from Apache Kafka logs, decreasing query time by approximately 21% across gigabytes of data.
- Gained in-depth knowledge of developing horizontally scalable big data solutions in a Kafka ecosystem and of full-stack development.
Education
M.S. Computational Data Analytics
Georgia Tech · Atlanta, GA
Aug 2022 - Dec 2023
M.Tech Computer Science, Dean's Merit List
IIIT Bangalore · Bangalore, India
Jun 2015 - Jul 2020
