OpenAI

ML Research Engineer - Hardware Codesign

San Francisco · Full-time · Mid-level

About this role

Join OpenAI's Hardware team to bridge ML research and silicon design by optimizing numerics, architecture, and performance for next-generation AI silicon. You'll debug performance gaps, prototype low-precision kernels, and drive hardware-software codesign decisions from simulation to production.

What you'll do

Develop and maintain roofline simulators to analyze system architecture tradeoffs and support technology decisions
Debug discrepancies between performance models and actual hardware measurements, identifying bottlenecks and root causes
Write emulation kernels for quantization and compression schemes to evaluate efficiency-quality tradeoffs
Prototype and synthesize novel numeric RTL modules, occasionally owning end-to-end implementations
Evaluate new ML workloads through simulation and functional testing to identify opportunities and risks
Communicate technical tradeoffs across ML research and hardware engineering teams with clear assumptions and evidence

What they're looking for

Python, C++, or Rust programming with a focus on correctness and extensibility
CUDA, Triton, or similar kernel development experience
PyTorch or JAX framework expertise
Floating-point numerics and model quantization knowledge
Transformer architecture understanding and large-scale training/inference optimization
RTL design (especially floating-point logic) and PPA tradeoff analysis
Cross-functional collaboration and technical communication
Roofline modeling and performance simulation

Benefits

Hybrid work arrangement (3 days onsite in San Francisco)
Relocation assistance available
Opportunity to work on cutting-edge AI hardware at scale
Collaboration with world-class ML researchers and hardware engineers
