OpenAI

Distributed Training Engineer, Sora

San Francisco · Full-time · Mid-level

About this role

Join OpenAI's Sora team to optimize distributed training systems for video foundation models. You'll enhance training throughput, enable researcher experimentation, and apply performance optimization techniques to large-scale AI infrastructure.

What you'll do

  • Collaborate with researchers to develop systems-efficient video model architectures
  • Apply advanced techniques to the internal training framework for optimal hardware efficiency
  • Profile and optimize training performance across distributed systems
  • Design and implement state-of-the-art AI model improvements
  • Write reliable, bug-free machine learning code for training pipelines

What they're looking for

  • Distributed systems optimization
  • Python proficiency
  • Training kernel optimization
  • Multi-modal ML pipelines
  • Systems-level performance analysis
  • Strong software engineering practices
  • Supercomputer performance understanding
  • Training dynamics and stability

Benefits

  • Hybrid work model (3 days/week in office)
  • San Francisco location
  • Relocation assistance available
  • Work on cutting-edge AI video capabilities
  • Collaborate with top research and product teams