OpenAI

Reliability/DFX Engineer

San Francisco, full-time, mid-level

About this role

OpenAI seeks an experienced Reliability/DFX Engineer to design and implement design-for-excellence features in next-generation AI accelerator chips. This cross-stack role bridges chip design, platform architecture, and hardware health teams to architect reliable systems from concept through high-volume production.

What you'll do

  • Oversee DFX architecture, implementation, and deployment in silicon, proposing high-ROI features for reliability and fault tolerance
  • Build system-level reliability models using empirical data to guide organizational DFX strategy
  • Collaborate with chip and platform teams to implement DFX features including digital/mixed-signal IP and firmware
  • Partner with hardware health teams to improve reliability in new product introduction and high-volume manufacturing phases
  • Drive data-driven improvements through experimental design and analysis across the hardware stack
  • Serve as a DFX/reliability advocate to align the industry ecosystem with OpenAI's requirements

What they're looking for

  • RTL design and design-for-test (DFT)
  • Reliability modeling and empirical data analysis
  • ML chip and platform architecture understanding
  • Physical implementation or silicon ATE experience
  • System-level hardware design
  • ML workload characterization
  • Cross-functional collaboration
  • Problem-solving at scale