OpenAI
ML Research Engineer - Hardware Codesign
San Francisco | Full-time | Mid-level
About this role
Join OpenAI's Hardware team to bridge ML research and silicon design by optimizing numerics, architecture, and performance for next-generation AI silicon. You'll debug performance gaps, prototype low-precision kernels, and drive hardware-software codesign decisions from simulation to production.
What you'll do
- Develop and maintain roofline simulators to analyze system architecture tradeoffs and support technology decisions
- Debug discrepancies between performance models and actual hardware measurements, identifying bottlenecks and root causes
- Write emulation kernels for quantization and compression schemes to evaluate efficiency-quality tradeoffs
- Prototype and synthesize novel numeric RTL modules, occasionally owning end-to-end implementations
- Evaluate new ML workloads through simulation and functional testing to identify opportunities and risks
- Communicate technical tradeoffs across ML research and hardware engineering teams with clear assumptions and evidence
What they're looking for
- Python, C++, or Rust programming with focus on correctness and extensibility
- CUDA, Triton, or similar kernel development experience
- PyTorch or JAX framework expertise
- Floating-point numerics and model quantization knowledge
- Understanding of transformer architectures and of large-scale training and inference optimization
- RTL design (especially floating-point logic) and PPA tradeoff analysis
- Cross-functional collaboration and technical communication
- Roofline modeling and performance simulation
Benefits
- Hybrid work arrangement (3 days onsite in San Francisco)
- Relocation assistance available
- Opportunity to work on cutting-edge AI hardware at scale
- Collaboration with world-class ML researchers and hardware engineers