
OpenAI

Research Engineer, Frontier Evals & Environments

San Francisco | Full-time | Mid-level

About this role

Help build evaluation environments and benchmarks that measure and guide the development of frontier AI agents at OpenAI. You'll create RL environments, design measurement methodologies, and directly influence training runs that shape next-generation model capabilities.

What you'll do

  • Design and build RL environments to test frontier model capabilities and behaviors
  • Develop methodologies for automatically exploring and understanding model behavior
  • Conduct rigorous analysis on evaluation scalability, reliability, and measurement variance
  • Guide training decisions for large-scale model runs and measure their outcomes
  • Build scalable systems for continuous evaluation across training pipelines
  • Create self-improvement loops to automate model understanding and analysis

What they're looking for

  • Machine learning fundamentals and systems thinking
  • LLM and reinforcement learning experience (RLHF/RLAIF preferred)
  • Software engineering and production ML systems
  • Evaluation, grading, and synthetic data methodology
  • Statistical analysis and experimental design
  • Cross-functional communication and collaboration
  • Research taste combined with engineering execution
  • Coding agents and tool-using agent experience