Staff Machine Learning Engineer
Staff ML engineer | AI products | Agentic systems
Staff-level ML engineer in Bengaluru with 8+ years across agentic AI, AI product platforms, production ML systems, and edge computer vision. At SolarWinds, I work in the Platform ML Team under Platform Engineering, building AI capabilities for ITOM and ITSM products across SaaS and self-hosted deployments.
Proof points
My strongest work sits where AI ideas have to become real product systems: workflows, APIs, auth, rate limits, evals, dashboards, SLOs, docs, and on-call ownership.
Profile
I work at SolarWinds, an ITOM/ITSM company that builds observability, monitoring, and service-management products across SaaS and self-hosted deployments. I sit in the Platform ML Team under Platform Engineering, where I originated and drive an AI copilot strategy across multiple product lines. The work spans multi-agent orchestration, MCP-based tool integration, RAG, SSE APIs, conversation management, reliability, and evaluation.
Before the current AI product work, I built AIOps systems for metric anomaly detection, alert noise reduction, RCA signal correlation, and log processing. Earlier roles covered retail edge computer vision at Infilect and CV/ML R&D at Tata Elxsi.
AI copilot originator and DRI, org-wide LLM gateway owner, investigative agent contributor.
AIOps, service-desk AI, metric anomaly detection, RCA correlation, log pattern mining.
TensorFlow Lite pipelines, retail loss detection, SKU recognition, image stitching.
Computer vision R&D for traffic-sign recognition, lane recognition, text detection, and video metadata.
Selected work
These are not toy demos. The through-line is turning AI capability into useful product experiences with the service boundaries, controls, and feedback loops needed to operate them.
Conceived and prototyped a multi-agent AI copilot in a one-week internal hackathon, then drove it into central product strategy across 5+ ITOM/ITSM product lines. Designed the agent framework, MCP tool layer, conversation APIs, RAG layer, SSE streaming contracts, and evaluation tooling.
Built and continue to own a centralized provider-agnostic LLM proxy on LiteLLM. It handles auth, model policy, distributed Redis rate limiting, Presidio-based PII masking, audit logs, spend governance, metrics, and client onboarding for engineering teams.
Shipped a time-series metric anomaly detection service that forecasted expected metric behavior to assess entity health and reduce alert noise. The service ran at roughly 10-15M requests per day (about 150 RPS), at an approximate cost of $0.00001 (one hundred-thousandth of a US dollar) per request.
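The core idea behind forecast-based anomaly detection is to score each point against a model of expected behavior and flag large deviations. A minimal stand-in, using a rolling-window z-score as the "forecast" (the production service used richer forecasting models; this only shows the scoring shape):

```python
from statistics import mean, pstdev

def rolling_anomalies(series: list[float], window: int = 5, k: float = 3.0) -> list[bool]:
    """Flag points that deviate more than k standard deviations from a
    rolling baseline. Toy sketch of forecast-vs-actual scoring, not the
    production detector."""
    flags = []
    for i, x in enumerate(series):
        hist = series[max(0, i - window):i]
        if len(hist) < window:
            flags.append(False)  # not enough history to score yet
            continue
        mu, sigma = mean(hist), pstdev(hist)
        flags.append(abs(x - mu) > k * max(sigma, 1e-9))
    return flags
```

Scoring against an expectation, rather than a fixed threshold, is what lets the same detector adapt per metric and cut alert noise.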
Built log processing systems around Drain3 template mining to compress repetitive logs, surface unusual log patterns, and select higher-signal context for LLM-driven incident analysis. The goal was smaller context windows with more diagnostic value, not clustering for its own sake.
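Template mining collapses log lines that differ only in variable tokens into one template with a count, which is what makes the compression possible. A toy stand-in for what Drain3 does with a proper parse tree (this masks only numbers, purely to show the idea):

```python
import re
from collections import Counter

def mine_templates(lines: list[str]) -> Counter:
    """Toy log-template miner: mask variable tokens so repeated message
    shapes collapse to one counted template. Drain3 handles this far more
    robustly; this sketch only illustrates the compression."""
    counts: Counter = Counter()
    for line in lines:
        template = re.sub(r"\b\d+\b", "<NUM>", line)
        counts[template] += 1
    return counts
```

Once repetitive lines collapse into templates, the rare templates are exactly the higher-signal context worth spending LLM context-window budget on.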
Co-designed and built an LLM/RAG-powered assistant for ITSM: recommended resolutions, ticket summaries, and actionable runbooks from process documentation. Supported PII-safe routing with on-demand and auto-trigger modes.
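At the heart of a RAG assistant is retrieval: rank the knowledge-base snippets against the ticket and feed the best ones to the model. A bag-of-words cosine sketch (the real system would use embeddings; names and documents below are made up for illustration):

```python
from collections import Counter
from math import sqrt

def top_doc(query: str, docs: dict[str, str]) -> str:
    """Return the best-matching document key by bag-of-words cosine
    similarity. A toy stand-in for embedding-based retrieval."""
    def vec(text: str) -> Counter:
        return Counter(text.lower().split())

    def cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[t] * b[t] for t in a)
        na = sqrt(sum(v * v for v in a.values()))
        nb = sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = vec(query)
    return max(docs, key=lambda name: cosine(q, vec(docs[name])))
```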
Open source
I kept this section grounded in my public GitHub repositories rather than presenting private company work as open source.
Installable CLI for diagnosing, diffing, generating, running, and testing Node.js/TypeScript MCP servers from OpenAPI specs.
Local real-time voice app for low-latency STT-to-LLM-to-TTS conversations, persona control, saved runs, and a browser evaluation harness on Apple Silicon.
Generalized university timetable solver using constraint optimization with configurable hard constraints, soft preferences, YAML/JSON input, and a planned desktop UI.
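Timetabling with hard constraints is a classic constraint-satisfaction problem. A tiny backtracking sketch of the idea (the actual solver uses proper constraint-optimization tooling and also scores soft preferences; course names here are hypothetical):

```python
from itertools import product

def schedule(courses, slots, rooms, conflicts):
    """Backtracking scheduler with two hard constraints: no two courses in
    the same (slot, room), and conflicting course pairs never share a
    slot. Toy sketch; real CP solvers scale far beyond this."""
    assignment = {}

    def ok(course, slot, room):
        for other, (s, r) in assignment.items():
            if (s, r) == (slot, room):
                return False  # room already taken in this slot
            if s == slot and {course, other} in conflicts:
                return False  # conflicting courses can't share a slot
        return True

    def solve(remaining):
        if not remaining:
            return True
        course, *rest = remaining
        for slot, room in product(slots, rooms):
            if ok(course, slot, room):
                assignment[course] = (slot, room)
                if solve(rest):
                    return True
                del assignment[course]  # undo and try the next option
        return False

    return assignment if solve(list(courses)) else None
```

Soft preferences turn this from satisfaction into optimization: instead of accepting the first feasible assignment, the solver scores assignments and searches for a low-penalty one.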
Earlier university timetable generator for a highly constrained NP-hard scheduling problem, built as a Python desktop application during my engineering degree.
Skills
Contact
Best fit: Staff or Senior Staff IC roles building AI products, agentic workflows, AI-assisted developer or customer experiences, and ML systems that need to operate reliably at scale.