Projects
Production AI Systems
Systems designed and operated at enterprise scale inside large engineering organizations. The source is proprietary, so they are described here at a high level.
LLM Evaluation Framework
Designed and operated the evaluation framework for generative AI products in production. Defined the methodology for retrieval quality (Recall@k, Precision@k, MRR) and generation quality (groundedness, faithfulness). Built on Databricks and Spark, the framework was adopted as standard practice across AI product teams.
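For readers unfamiliar with the retrieval metrics named above, here is a minimal sketch of their standard definitions in plain Python. It is not the proprietary framework's code, which ran on Databricks and Spark. The function names, document IDs, and example data are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Sequence, Set, Tuple


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        return 0.0  # assumption: a query with no relevant documents scores 0
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant document, or 0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (ranked results, relevant set) pairs."""
    scores = [reciprocal_rank(ranked, relevant) for ranked, relevant in queries]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical example: two queries with hand-labelled relevant documents.
example_queries = [
    (["d3", "d1", "d7", "d2"], {"d1", "d2"}),  # first relevant hit at rank 2
    (["a", "b"], {"a"}),                        # first relevant hit at rank 1
]
```

With the first example query, `precision_at_k(..., k=3)` counts one relevant document among the top three and `recall_at_k(..., k=3)` finds one of the two relevant documents. Generation metrics such as groundedness and faithfulness usually need a judge model or human labels, so they are not sketched here.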
RAG Pipeline Infrastructure
Architected production RAG pipelines connecting LLMs to enterprise knowledge sources. Designed embedding workflows, vector search infrastructure, and retrieval optimization for high-stakes enterprise workloads where hallucination risk is unacceptable.
Agentic Workflow Platform
Designed and deployed agentic AI systems in production: multi-turn orchestration, function calling, and tool-use patterns at enterprise scale across AWS and Azure. Built the infrastructure enabling LLMs to execute multi-step tasks autonomously against internal systems.
Internal Developer Platform
Led the engineering platform serving hundreds of developers across build, ship, and operate workflows. Designed secure-by-default CI/CD pipelines, containerization strategies, and IaC tooling that drove the shift from monolithic deployments to microservices.
Live Demos
Running systems with public source. These are live; you can interact with them or read the code.
Open Source
Public repos covering RAG evaluation, LLM benchmarking, agentic systems, and AI foundations.
Graduate Research
Projects completed during the UC Berkeley Master of Information and Data Science (MIDS) program, 2021–2023.