OpenAI
Training: ML Framework Engineer
San Francisco, full-time, mid-level
About this role
Join OpenAI's Training Runtime team to optimize distributed machine learning infrastructure that powers frontier-scale model training. You'll enhance training throughput by profiling and optimizing frameworks, applying cutting-edge techniques, and enabling researchers to build next-generation models on massive GPU clusters.
What you'll do
- Apply advanced techniques to improve hardware efficiency in the internal training framework
- Profile and optimize training framework performance
- Collaborate with researchers to support next-generation model development
- Design and implement state-of-the-art AI model optimizations
- Write high-quality, bug-free machine learning code
- Improve distributed system performance at scale
What they're looking for
- Python proficiency
- Machine learning systems optimization
- Distributed systems knowledge
- Performance profiling and debugging
- Software engineering best practices
- ML framework experience
- Hardware efficiency optimization
- Understanding of supercomputer performance
Benefits
- Hybrid work model (3 days/week in San Francisco office)
- Relocation assistance available
- Work on frontier-scale AI training systems
- Collaborate with leading AI researchers
- Impact large-scale GPU cluster performance