Omeed Tehrani
just another day in paradise

I'm a researcher and engineer focused on deep reinforcement learning and large language models, with a particular interest in AI safety and alignment. My current focus is efficient ML systems: distributed training optimization, inference acceleration, and the systems-level challenges of scaling frontier models. Long-term, I want to move deeper down the abstraction stack, working on training efficiency, optimization algorithms, and the fundamental mechanisms that make intelligent systems work, whether at a frontier lab, a research institution, or wherever the most interesting problems are.
I hold MS and BS degrees in Computer Science from UT Austin. In the RobIN Laboratory (Robotic Interactive Intelligence), advised by Dr. Roberto Martin-Martin and visiting scholar Dr. Fernando Fernández Rebollo, I conducted research on transfer learning, multi-task RL using the MetaWorld manipulation benchmark suite, and return-conditioned sequence modeling with Decision Transformers on robomimic datasets. I also worked in the AMRL (Autonomous Mobile Robotics Laboratory) with Dr. Joydeep Biswas on inverse kinodynamics for autonomous vehicle drifting; that work was selected for presentation at an Amazon AI Symposium.
My journey through reinforcement learning has been non-linear. Starting with robotic manipulation and policy learning in the RobIN lab, I built intuition for learning from interaction, credit assignment, and the exploration-exploitation tradeoff. Those fundamentals from training robotic arms now inform how I think about LLMs as RL agents, carrying insights from offline RL and sequence modeling into distributed training infrastructure, model optimization, and efficient inference at scale. I'm somewhere past the Valley of Despair on the Dunning-Kruger curve, where the derivative is finally positive again, climbing toward actual competence one paper at a time.
The Dunning-Kruger Journey
Research Interests
Training & Scaling Frontier Models
How to make training runs cheaper and faster—studying distributed training, optimization algorithms, and the systems challenges of scaling to AGI.
AI Safety & Alignment
Understanding how frontier models actually work and making sure they do what we want—especially as they get closer to human-level intelligence.
RL from Human Feedback
Came from robotic RL research; now interested in how those same ideas apply to aligning LLMs and future AGI systems with human values.
ML Infrastructure at Scale
Building the systems that make frontier research possible—multi-GPU orchestration, efficient inference, and production deployment challenges.
Current Work
AI Infrastructure @ Capital One
Senior Software Engineer • July 2025 - Present
Building production AI infrastructure and agentic systems at enterprise scale: architecting Model Context Protocol (MCP) integrations for contextual AI workflows, developing agent-to-agent communication patterns with Google's A2A protocol, and building scalable Python APIs for LLM orchestration across cloud infrastructure. Also working on multi-agent coordination, prompt optimization, and deploying frontier models in production environments. Previously engineered vulnerability data pipelines processing millions of CVE records with distributed systems and real-time analytics.
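To give a flavor of what agent-to-agent coordination involves, here is a minimal Python sketch of a message envelope and dispatch loop. The names here (AgentMessage, route_message) are hypothetical illustrations, not the actual MCP or A2A APIs used in production.

```python
# Minimal sketch of agent-to-agent message passing. AgentMessage and
# route_message are hypothetical names for illustration only, not the
# MCP or A2A interfaces referenced above.
from dataclasses import dataclass, field
from typing import Callable, Dict
import uuid


@dataclass
class AgentMessage:
    sender: str      # id of the originating agent
    recipient: str   # id of the target agent
    task: str        # natural-language task description
    message_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def route_message(msg: AgentMessage,
                  registry: Dict[str, Callable[[AgentMessage], str]]) -> str:
    """Look up the recipient agent's handler and deliver the message."""
    handler = registry.get(msg.recipient)
    if handler is None:
        raise KeyError(f"no agent registered under {msg.recipient!r}")
    return handler(msg)


# Usage: register a toy agent and pass a task to it.
registry = {"summarizer": lambda m: f"summary of: {m.task}"}
reply = route_message(
    AgentMessage("planner", "summarizer", "summarize Q3 CVE trends"), registry)
print(reply)
```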
Independent Research
Self-Directed • Ongoing
Deep-diving into efficient ML systems papers—distributed training optimization, quantization, memory-efficient fine-tuning. Studying recent work on pipeline parallelism, gradient compression, and inference acceleration. Implementing techniques from scratch and documenting findings through technical write-ups.
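As one example of implementing these techniques from scratch, top-k gradient compression reduces communication in distributed training by transmitting only the largest-magnitude gradient entries. A toy sketch, not tied to any specific paper's reference implementation:

```python
# Toy top-k gradient compression: keep only the largest-magnitude entries,
# ship (values, indices), and scatter back into a dense gradient.
import math
import torch


def topk_compress(grad: torch.Tensor, ratio: float = 0.01):
    """Keep the largest-magnitude `ratio` fraction of gradient entries."""
    flat = grad.flatten()
    k = max(1, int(flat.numel() * ratio))
    _, indices = torch.topk(flat.abs(), k)
    return flat[indices], indices, grad.shape  # sparse payload to communicate


def topk_decompress(values, indices, shape):
    """Scatter the transmitted entries back into a dense gradient."""
    flat = torch.zeros(math.prod(shape))
    flat[indices] = values
    return flat.reshape(shape)


g = torch.randn(4, 256)
vals, idx, shape = topk_compress(g, ratio=0.05)
g_hat = topk_decompress(vals, idx, shape)
print(f"kept {vals.numel()} of {g.numel()} entries")
```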
From Scratch Podcast
Founder • January 2025 - Present
Long-form conversations with researchers, engineers, and builders exploring first-principles thinking in AI, systems design, and the future of intelligent computation.
Constellation 🛰️
Side Project with Friends • 2024 - Present
Working on an interesting space project with some friends—building AI-powered satellite network systems for resilient space telecommunications. Exploring how machine learning can optimize orbital communications and mesh network routing at scale. Still figuring things out, but it's a fun excuse to learn about space systems.
Selected Publications & Research
GigaAPI: A User-Space API for Multi-GPU Programming →
arXiv:2504.01266 • 2025
Designed and implemented a user-space API abstracting multi-GPU programming complexities, enabling developers to leverage parallel GPU systems without deep CUDA expertise. Bridging the gap between hardware capabilities and accessible parallel computing.
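To illustrate the kind of abstraction this targets (not GigaAPI's actual interface; see the paper for that), here is a hypothetical PyTorch-flavored sketch that shards work across whatever GPUs are visible without hand-written CUDA:

```python
# Hypothetical illustration of a user-space multi-GPU abstraction: shard a
# batch across all visible GPUs and gather results. NOT GigaAPI's real API.
import torch


def parallel_map(fn, batch: torch.Tensor) -> torch.Tensor:
    """Shard `batch` across available GPUs, apply `fn`, gather on CPU."""
    n = torch.cuda.device_count()
    if n == 0:
        return fn(batch)  # fall back to CPU when no GPUs are present
    shards = batch.chunk(n)
    outs = [fn(shard.to(f"cuda:{i}")) for i, shard in enumerate(shards)]
    return torch.cat([o.cpu() for o in outs])


result = parallel_map(lambda x: x * 2.0, torch.randn(1024, 64))
print(result.shape)
```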
Learning Inverse Kinodynamics for Autonomous Vehicle Drifting →
UT Austin AMRL • arXiv:2402.14928 • 2024 • Selected for Amazon AI Symposium
Data-driven kinodynamic model learning for high-speed autonomous drifting on UT Automata platform. Demonstrated successful obstacle avoidance through learned curvature correction.
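The core idea, simplified: instead of sending the desired curvature straight to the controller, learn a model that corrects it for how the vehicle actually responds at speed. A sketch with an illustrative model shape, not the paper's exact formulation:

```python
# Simplified inverse-kinodynamics sketch: map (velocity, desired curvature)
# to the curvature command that actually produces the desired turn at speed.
# The MLP shape and inputs here are illustrative, not the paper's setup.
import torch
import torch.nn as nn

ikd_model = nn.Sequential(
    nn.Linear(2, 32), nn.ReLU(),
    nn.Linear(32, 32), nn.ReLU(),
    nn.Linear(32, 1),   # corrected curvature command
)

velocity = torch.tensor([[3.5]])           # m/s
desired_curvature = torch.tensor([[0.8]])  # 1/m, the turn we want to execute
commanded = ikd_model(torch.cat([velocity, desired_curvature], dim=1))
print(commanded.item())  # send this, not the raw desired value, downstream
```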
Deep RL for Autonomous Drifting: Outperforming Model-Based Control →
Currently Working On • 2024 • Building on IKD Work
Extending my IKD research with end-to-end deep reinforcement learning using Soft Actor-Critic (SAC). Preliminary results show 49% faster task completion (27 vs 53 steps) while maintaining 100% success rate—significantly outperforming both baseline controllers and learned inverse dynamics. Exploring whether RL can discover superior trajectories for complex dynamic tasks compared to hand-engineered approaches.
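The SAC critic target is the standard soft Bellman backup with entropy regularization; a minimal sketch with generic placeholder tensors, not the project's code:

```python
# Standard SAC critic target:
#   y = r + gamma * (1 - done) * (min(Q1', Q2') - alpha * log pi(a'|s'))
import torch


def sac_critic_target(reward, done, next_q1, next_q2, next_logp,
                      gamma: float = 0.99, alpha: float = 0.2):
    """Soft Bellman backup with the twin-critic minimum and entropy bonus."""
    soft_value = torch.min(next_q1, next_q2) - alpha * next_logp
    return reward + gamma * (1.0 - done) * soft_value
```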
Drift Gym: A Production-Grade Environment for Autonomous Drifting Research →
Active Development • 2024
Building a research-grade Gymnasium environment for autonomous drifting with realistic Pacejka tire dynamics, 10+ diverse scenarios (loose, tight, slalom, figure-8), curriculum learning, domain randomization, and full YAML configuration. Designing for reproducible RL research with proper observation spaces, reward shaping for drift control, and deterministic seeding. Using this as the foundation for my continued deep RL experiments on autonomous drifting.
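A skeleton of the Gymnasium interface involved, with the dynamics stubbed out (the real environment integrates Pacejka tire forces); the state and action layouts here are illustrative:

```python
# Gymnasium-style drifting environment skeleton: observation/action spaces,
# deterministic seeding via gym's RNG, and a shaped drift reward. Dynamics
# are placeholders standing in for the Pacejka tire model.
import gymnasium as gym
import numpy as np
from gymnasium import spaces


class DriftEnv(gym.Env):
    def __init__(self):
        # [x, y, yaw, vx, vy, yaw_rate] -- illustrative state layout
        self.observation_space = spaces.Box(-np.inf, np.inf,
                                            shape=(6,), dtype=np.float32)
        # [steering, throttle] in normalized units
        self.action_space = spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)  # deterministic seeding via self.np_random
        self.state = np.zeros(6, dtype=np.float32)
        return self.state, {}

    def step(self, action):
        # Placeholder dynamics; the real env integrates tire forces here.
        noise = self.np_random.normal(0, 0.01, 6).astype(np.float32)
        self.state = self.state + noise
        slip_angle = float(np.arctan2(self.state[4], abs(self.state[3]) + 1e-6))
        reward = -abs(slip_angle - 0.5)  # shaped: track a target drift angle
        return self.state, reward, False, False, {}
```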
Decision Transformers for Robotic Imitation Learning →
UT Austin RobIN Lab • 2023
Extended Decision Transformer architecture for return-conditioned imitation learning on mixed-quality robomimic datasets. Achieved significant performance improvements over behavioral cloning baselines on manipulation tasks.
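The heart of return-conditioning is the data transform: each timestep is prefixed with its return-to-go, so the model learns actions conditioned on desired outcomes and can be prompted with expert-level returns at test time. A toy illustration of the transform only:

```python
# Return-to-go computation, the core of return-conditioned sequence modeling:
# R_t = sum of rewards from timestep t to the end of the episode.
import numpy as np


def returns_to_go(rewards: np.ndarray) -> np.ndarray:
    """Suffix sums of the reward sequence."""
    return np.cumsum(rewards[::-1])[::-1]


rewards = np.array([0.0, 0.0, 1.0, 0.0, 1.0])
print(returns_to_go(rewards))  # [2. 2. 2. 1. 1.]
```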
Technical Projects
TinyRL-Tetris
RL • Implementing deep RL algorithms from scratch for Tetris: exploring DQN, policy gradients, and reward shaping as a pedagogical exercise in understanding RL fundamentals. (A minimal DQN sketch appears after this list.)
Starlink Satellite Simulator
Systems • Production-grade satellite communication simulator with RF physics, regulatory compliance, dynamic beam steering, and real-time link analysis. Built for aerospace mission planning.
Block-Wise Hierarchical Transformer
LLM • PyTorch chatbot implementation with self-attention mechanisms achieving 1.32 perplexity. Exploration of transformer architectures, attention patterns, and sequence-to-sequence learning.
MemPharos
Systems • User-space paging system with custom memory manipulation, intercepting system signals for demand-loading ELF binaries. Deep dive into OS internals and virtual memory management.
RemoteSyncFS
Distributed • FUSE-based distributed file system with remote synchronization capabilities. Exploring consistency models, conflict resolution, and distributed systems primitives.
StrategoSpheres
AI/Search • Adversarial search algorithms and heuristic evaluation for zero-sum strategic games. Minimax, alpha-beta pruning, and advanced game tree search optimization techniques.
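As referenced under TinyRL-Tetris above, here is a minimal sketch of the core DQN update (toy dimensions and random data, not the project's actual code):

```python
# Core DQN loss: TD error against a frozen target network.
import torch
import torch.nn as nn

q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net.load_state_dict(q_net.state_dict())  # periodic sync in practice


def dqn_loss(s, a, r, s_next, done, gamma=0.99):
    """Mean-squared TD error; the frozen target net stabilizes training."""
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + gamma * (1 - done) * target_net(s_next).max(dim=1).values
    return nn.functional.mse_loss(q, target)


batch = (torch.randn(8, 4), torch.randint(0, 2, (8,)),
         torch.randn(8), torch.randn(8, 4), torch.zeros(8))
print(dqn_loss(*batch).item())
```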
Featured In
UTCS Alumnus: Changing Paths and Finding Purpose in Tech →
2025 • UT Austin Computer Science • Profile on my journey through graduate research, startups, and finding purpose in AI engineering
Eighteen UTCS Students Awarded Endowed Presidential Scholarships →
2021 • UT Austin Computer Science • Recognized for academic excellence with the W.D. Blunk Endowed Presidential Scholarship
Research Journey
Documenting my evolution from robotic RL (RobIN/AMRL) to efficient ML systems and frontier models. Explore my publications, paper analyses, and current reading list on distributed training, quantization, and inference optimization.
View Papers & Reading List →
Let's Connect
I'm always happy to discuss research, collaboration opportunities, or interesting problems in AI safety and RL. Reach out if you're working on alignment, want to chat about papers, or just want to geek out about systems and ML infrastructure.