
Career

Professional Experience

Data Scientist

ExtraHop Networks · Seattle, WA

Apr 2024 - Present

As a Data Scientist on ExtraHop's research team, I build and productionize machine-learning systems that power enterprise-scale threat detection and network intelligence.

  • Productionized a streaming SQL Injection detector using Naive Bayes and Firehose telemetry for drift monitoring, containerized and deployed via ExtraHop's async stack; reduced false positives by 53% while maintaining sub-100 ms latency.
  • Developed a multi-agent RAG system (LangGraph + OpenAI GPT-4o + ChromaDB + BGE embeddings) that parsed 2K+ merge requests to surface tribal knowledge and internal conventions, enabling context-aware code reviews and accelerating development.
  • Deployed a ReAct agent (LangGraph + DSPy + LiteLLM) with web scraping and threat intel tools to generate threat briefings for CVEs, reducing analyst response time from hours to minutes.
  • Optimized GPT-4.1 nano for threat briefing generation using DSPy's GEPA algorithm with LLM-as-Judge evaluation, improving task performance from 71% to 93%. Tracked experiments and traced pipeline runs with MLflow.
LangGraph · GPT-4o · ChromaDB · DSPy · LiteLLM · MLflow · Naive Bayes · Docker · Python

Decision Analytics Consultant

Investor Group Services (IGS) · Chicago, IL

Jan 2024 - Mar 2024

Drove data infrastructure modernization and analytics automation for due diligence and client case management.

  • Spearheaded migration of the entire legacy database to the Microsoft Azure Fabric environment, using SharePoint Lists for data integration. Developed a real-time client case management tool with Power BI dashboards, significantly improving query response times.
  • Engineered automated scripts for generating due diligence reports and created a geospatial dashboard integrating ArcGIS with Power BI, providing strategic acquisition insights for clients.
  • Optimized SQL queries to significantly improve the efficiency of the client's data warehousing operations.
Azure Fabric · Power BI · ArcGIS · SharePoint · SQL

Data Science Co-op

NCR Corporation · Atlanta, GA

Aug 2023 - Dec 2023

Built a GenAI assistant for enterprise support automation, enabling self-service ticket resolution for 5K+ field technicians.

  • Built a GenAI assistant (Azure OpenAI + LangChain) that resolves technician tickets, unlocking self-service for 5K+ field agents.
  • Curated FAISS + ChromaDB vector stores with ada-002 embeddings to deliver high-recall retrieval grounded in NCR's knowledge base.
Azure OpenAI · LangChain · FAISS · ChromaDB · GPT-4o · Python

Data Science Intern

Cognira · Atlanta, GA

May 2023 - Aug 2023

Built PySpark pipelines integrating fragmented retail data into comprehensive promotional datasets, driving inventory optimization.

  • Designed PySpark pipelines, integrating fragmented retail data into comprehensive promotional datasets.
  • Predicted post-promo demand drops using regression models, driving $485K in inventory optimization for a flagship client.
PySpark · Regression · Pandas · SQL · Retail Analytics

Machine Learning Engineer

Reliance Jio AI-COE · India

Aug 2020 - Jul 2022

Led AI initiatives for one of the world's largest telecom networks to solve real-time customer quality and performance issues.

  • Optimized ETL pipelines using PySpark & Apache Airflow, reducing batch job runtime from 13h to 3.3h across 400M+ users.
  • Trained an end-to-end Digital Twin for regional internet performance using LightGBM/XGBoost with advanced feature engineering and tuning, raising AUC from 0.61 to 0.82 for download speed prediction.
  • Used SHAP explainability to drive root cause analysis of 4G/5G failures, enabling proactive service interventions.
  • Integrated root cause codes into network optimization and marketing use cases.
PySpark · Apache Airflow · LightGBM · XGBoost · SHAP · Python

Research and Development Intern

Siemens Technology India · India

Dec 2019 - Jun 2020

Researched behavioral models for autonomous navigation using Reinforcement Learning and Imitation Learning.

  • Researched multiple behavioral models for autonomous navigation using Reinforcement Learning (RL) and Imitation Learning (IL).
  • Selected the 'Learning by Cheating' method and implemented it with TensorFlow, OpenAI Gym, and CARLA.
  • Devised a pipeline using Mask-RCNN and IoU tracking to transform raw traffic camera feeds into distinct vehicle trajectories, doubling the size of the trajectory dataset.
TensorFlow · OpenAI Gym · CARLA · Mask-RCNN · Python · Deep Learning

Summer Technology Analyst

Morgan Stanley · Bengaluru, India

May 2019 - Jul 2019

Developed big data tooling for querying historical Apache Kafka logs, enabling faster validation of business-critical operations.

  • Developed and deployed a Java-based tool on in-house cloud instances for querying past data from Apache Kafka logs, decreasing query time by approximately 21% across gigabytes of data.
  • Gained in-depth knowledge of building horizontally scalable big data solutions in the Kafka ecosystem and of full-stack development.
Java · Apache Kafka · Big Data · Cloud · Full-Stack

Education

M.S. Computational Data Analytics

Georgia Tech · Atlanta, GA

Aug 2022 - Dec 2023

M.Tech Computer Science, Dean's Merit List

IIIT Bangalore · Bangalore, India

Jun 2015 - Jul 2020