OpenAI

ML Research Engineer - Hardware Codesign

San Francisco, full-time, mid-level

About this role

Join OpenAI's Hardware team to bridge ML research and silicon design by optimizing numerics, architecture, and performance for next-generation AI chips. You'll debug performance gaps, prototype low-precision kernels, and drive hardware-software codesign decisions from simulation to production.

What you'll do

  • Develop and maintain roofline simulators to analyze system architecture tradeoffs and support technology decisions
  • Debug discrepancies between performance models and actual hardware measurements, identifying bottlenecks and root causes
  • Write emulation kernels for quantization and compression schemes to evaluate efficiency-quality tradeoffs
  • Prototype and synthesize novel numeric RTL modules, occasionally owning end-to-end implementations
  • Evaluate new ML workloads through simulation and functional testing to identify opportunities and risks
  • Communicate technical tradeoffs across ML research and hardware engineering teams with clear assumptions and evidence

What they're looking for

  • Python, C++, or Rust programming with focus on correctness and extensibility
  • CUDA, Triton, or similar kernel development experience
  • PyTorch or JAX framework expertise
  • Floating-point numerics and model quantization knowledge
  • Transformer architecture understanding and large-scale training/inference optimization
  • RTL design (especially floating-point logic) and PPA tradeoff analysis
  • Cross-functional collaboration and technical communication
  • Roofline modeling and performance simulation

Benefits

  • Hybrid work arrangement (3 days onsite in San Francisco)
  • Relocation assistance available
  • Opportunity to work on cutting-edge AI hardware at scale
  • Collaboration with world-class ML researchers and hardware engineers