Omeed Tehrani

just another day in paradise

I'm a researcher and engineer focused on deep reinforcement learning and large language models, with a standing interest in AI safety and alignment. My current focus is efficient ML systems: distributed training optimization, inference acceleration, and the systems-level challenges of scaling frontier models. Long-term, I want to move deeper down the abstraction stack, toward training efficiency, optimization algorithms, and the fundamental mechanisms that make intelligent systems work, whether at a frontier lab, a research institution, or wherever the most interesting problems are.

I hold MS and BS degrees in Computer Science from UT Austin. There, I conducted research in the RobIN Laboratory (Robotic Interactive Intelligence) on transfer learning, multi-task RL using the MetaWorld manipulation benchmark suite, and return-conditioned sequence modeling with Decision Transformers on robomimic datasets, advised by Dr. Roberto Martin-Martin and visiting scholar Dr. Fernando Fernández Rebollo. I also worked in the AMRL (Autonomous Mobile Robotics Laboratory) with Dr. Joydeep Biswas on inverse kinodynamics for autonomous vehicle drifting. My work was selected for presentation at an Amazon AI Symposium.

My journey through reinforcement learning has been non-linear. Starting with robotic manipulation and policy learning in the RobIN lab, I built intuition for how agents learn from interaction, credit assignment, and the exploration-exploitation tradeoff. Those fundamentals from training robotic arms now shape how I think about LLMs as RL agents, and I apply insights from offline RL and sequence modeling to distributed training infrastructure, model optimization, and efficient inference at scale. I'm somewhere past the Valley of Despair on the Dunning-Kruger curve, where the derivative is finally positive again, climbing toward actual competence one paper at a time.

The Dunning-Kruger Journey

[Interactive chart: confidence vs. knowledge along the Dunning-Kruger curve, tracing the journey from overconfidence through despair to enlightenment, with a marker for where I currently sit.]

Research Interests

Training & Scaling Frontier Models

How to make training runs cheaper and faster—studying distributed training, optimization algorithms, and the systems challenges of scaling to AGI.

AI Safety & Alignment

Understanding how frontier models actually work and making sure they do what we want—especially as they get closer to human-level intelligence.

RL from Human Feedback

Came from robotic RL research, now interested in how those same ideas apply to aligning LLMs and future AGI systems with human values.

ML Infrastructure at Scale

Building the systems that make frontier research possible—multi-GPU orchestration, efficient inference, and production deployment challenges.

Current Work

AI Infrastructure @ Capital One

Senior Software Engineer • July 2025 - Present

Building production AI infrastructure and agentic systems at enterprise scale. Architecting Model Context Protocol integrations for contextual AI workflows, developing Google A2A agent-to-agent communication patterns, and building scalable Python APIs for LLM orchestration across cloud infrastructure. Working on multi-agent coordination, prompt optimization, and deploying frontier models in production environments. Previously engineered vulnerability data pipelines processing millions of CVE records with distributed systems and real-time analytics.
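
Capital One's internal systems aren't public, so as a point of reference, here is a minimal MCP tool server using the open-source `mcp` Python SDK (FastMCP). The server name and tool are hypothetical placeholders, not anything from the actual stack:

```python
# Minimal MCP tool server via the public `mcp` Python SDK (FastMCP).
# Server name and tool below are hypothetical placeholders.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("example-context-server")

@mcp.tool()
def lookup_cve(cve_id: str) -> str:
    """Return a short summary for a CVE identifier (stubbed for illustration)."""
    # A real server would query a vulnerability database here.
    return f"No data loaded for {cve_id}; this is a stub."

if __name__ == "__main__":
    # Serves the tool over stdio so an MCP-capable client or agent can call it.
    mcp.run()
```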

Independent Research

Self-Directed • Ongoing

Deep-diving into efficient ML systems papers—distributed training optimization, quantization, memory-efficient fine-tuning. Studying recent work on pipeline parallelism, gradient compression, and inference acceleration. Implementing techniques from scratch and documenting findings through technical write-ups.
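
As one example of the kind of technique these write-ups re-implement, here is a toy sketch of symmetric per-tensor int8 weight quantization in PyTorch (illustrative only, not drawn from any particular paper):

```python
# Toy symmetric per-tensor int8 quantization: one float scale per tensor.
import torch

def quantize_int8(w: torch.Tensor):
    """Map float weights to int8 so the largest magnitude lands at +/-127."""
    scale = w.abs().max() / 127.0
    q = torch.clamp((w / scale).round(), -128, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(4, 4)
q, scale = quantize_int8(w)
print((w - dequantize(q, scale)).abs().max())  # worst-case quantization error
```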

From Scratch Podcast

Founder • January 2025 - Present

Long-form conversations with researchers, engineers, and builders exploring first-principles thinking in AI, systems design, and the future of intelligent computation.

Constellation 🛰️

Side Project with Friends • 2024 - Present

Working on an interesting space project with some friends—building AI-powered satellite network systems for resilient space telecommunications. Exploring how machine learning can optimize orbital communications and mesh network routing at scale. Still figuring things out, but it's a fun excuse to learn about space systems.

Selected Publications & Research

GigaAPI: A User-Space API for Multi-GPU Programming →

arXiv:2504.01266 • 2025

Designed and implemented a user-space API abstracting multi-GPU programming complexities, enabling developers to leverage parallel GPU systems without deep CUDA expertise. Bridging the gap between hardware capabilities and accessible parallel computing.
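
GigaAPI's actual interface is described in the paper and isn't reproduced here; as a rough illustration, this PyTorch sketch shows the kind of manual per-device orchestration that a user-space multi-GPU API is meant to hide:

```python
# Not GigaAPI's interface: a PyTorch sketch of the manual per-device
# bookkeeping that a user-space multi-GPU abstraction takes off your hands.
import torch

def scale_across_gpus(x: torch.Tensor) -> torch.Tensor:
    """Split a batch across all visible GPUs and process chunks in parallel."""
    n = torch.cuda.device_count()
    outs = []
    for dev, chunk in enumerate(x.chunk(n)):
        with torch.cuda.device(dev):
            # Kernel launches are asynchronous, so work on different
            # devices can overlap until we gather results on the CPU.
            outs.append(chunk.to(f"cuda:{dev}") * 2.0)
    return torch.cat([o.cpu() for o in outs])

if torch.cuda.is_available():
    print(scale_across_gpus(torch.randn(8, 1024)).shape)
```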

Learning Inverse Kinodynamics for Autonomous Vehicle Drifting →

UT Austin AMRL • arXiv:2402.14928 • 2024 • Selected for Amazon AI Symposium

Data-driven kinodynamic model learning for high-speed autonomous drifting on the UT Automata platform. Demonstrated successful obstacle avoidance through learned curvature correction.
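
The paper's exact formulation isn't reproduced here, but the core inverse-kinodynamics idea can be sketched as a small supervised model: given speed and a desired curvature, predict the curvature command that makes the vehicle actually track it. Model size and inputs below are illustrative:

```python
# Hedged sketch of the inverse-kinodynamics idea (not the paper's exact model).
import torch
import torch.nn as nn

class IKDModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Inputs: [speed, desired_curvature]; output: corrected curvature command.
        self.net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, speed, desired_curvature):
        return self.net(torch.stack([speed, desired_curvature], dim=-1))

# Training pairs come from driving logs: the curvature that was commanded
# versus the curvature the vehicle actually executed at that speed.
model = IKDModel()
loss_fn = nn.MSELoss()
print(model(torch.tensor([2.5]), torch.tensor([0.8])))  # corrected command
```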

🚧 WORK IN PROGRESS

Deep RL for Autonomous Drifting: Outperforming Model-Based Control →

Currently Working On • 2024 • Building on IKD Work

Extending my IKD research with end-to-end deep reinforcement learning using Soft Actor-Critic (SAC). Preliminary results show 49% faster task completion (27 vs. 53 steps) while maintaining a 100% success rate, significantly outperforming both baseline controllers and learned inverse dynamics. Exploring whether RL can discover better trajectories for complex dynamic tasks than hand-engineered approaches; a sketch of the SAC critic target follows below.

SAC • Off-Policy RL • F1/10
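
For reference, the SAC critic target follows the standard formulation: a minimum over twin critics plus an entropy bonus. A minimal sketch, not the project's actual training code:

```python
# Standard SAC critic target with twin critics and entropy regularization.
import torch

def sac_target(reward, done, next_q1, next_q2, next_logp, gamma=0.99, alpha=0.2):
    """y = r + gamma * (1 - done) * (min(Q1', Q2') - alpha * log pi(a'|s'))."""
    next_v = torch.min(next_q1, next_q2) - alpha * next_logp
    return reward + gamma * (1.0 - done) * next_v
```
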
🚧 WORK IN PROGRESS

Drift Gym: A Production-Grade Environment for Autonomous Drifting Research →

Active Development • 2024

Building a research-grade Gymnasium environment for autonomous drifting with realistic Pacejka tire dynamics, 10+ diverse scenarios (loose, tight, slalom, figure-8), curriculum learning, domain randomization, and full YAML configuration. Designing for reproducible RL research with proper observation spaces, reward shaping for drift control, and deterministic seeding; a minimal interface sketch follows below. Using this as the foundation for my continued deep RL experiments on autonomous drifting.

Gymnasium • Pacejka Model • Curriculum Learning • Open Source
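
Here is a minimal sketch of what that Gymnasium interface looks like, with placeholder names and a stubbed dynamics update standing in for the Pacejka model:

```python
# Skeleton of a Gymnasium drifting env (illustrative; names are placeholders).
import gymnasium as gym
import numpy as np
from gymnasium import spaces

class DriftEnvSketch(gym.Env):
    def __init__(self):
        # Observation: [x, y, yaw, vx, vy, yaw_rate]; action: [steer, throttle].
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(6,), dtype=np.float32)
        self.action_space = spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)  # deterministic seeding via self.np_random
        self._state = np.zeros(6, dtype=np.float32)
        return self._state.copy(), {}

    def step(self, action):
        # Real dynamics would integrate a bicycle model with Pacejka tire forces.
        self._state[:2] += 0.01 * action  # placeholder update
        reward, terminated, truncated = 0.0, False, False
        return self._state.copy(), reward, terminated, truncated, {}
```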

Decision Transformers for Robotic Imitation Learning →

UT Austin RobIN Lab • 2023

Extended Decision Transformer architecture for return-conditioned imitation learning on mixed-quality robomimic datasets. Achieved significant performance improvements over behavioral cloning baselines on manipulation tasks.
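
The key preprocessing step behind return-conditioned modeling is computing returns-to-go, which Decision Transformers condition on at each timestep. A small illustrative sketch (gamma and the example rewards are arbitrary):

```python
# Returns-to-go: the per-timestep conditioning signal Decision Transformers use.
import numpy as np

def returns_to_go(rewards, gamma=1.0):
    """RTG[t] = sum over k >= t of gamma^(k-t) * r_k, computed right-to-left."""
    rtg = np.zeros_like(rewards, dtype=np.float64)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg

print(returns_to_go(np.array([1.0, 0.0, 2.0])))  # -> [3. 2. 2.]
```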


Research Journey

Documenting my evolution from robotic RL (RobIN/AMRL) to efficient ML systems and frontier models. Explore my publications, paper analyses, and current reading list on distributed training, quantization, and inference optimization.

View Papers & Reading List →

Let's Connect

I'm always happy to discuss research, collaboration opportunities, or interesting problems in AI safety and RL. Reach out if you're working on alignment, want to chat about papers, or just want to geek out about systems and ML infrastructure.