Anthropic
Research Engineer, Code RL (Reinforcement Learning)
San Francisco, CA | New York City, NY | From $850k | mid | Added 2 days ago
About this role
Anthropic is seeking a Research Engineer to advance Claude's code generation and software engineering capabilities through reinforcement learning. You'll design RL environments, build reward systems, run training experiments, and optimize pipelines to help AI models write, test, debug, and deploy real software end-to-end.
What you'll do
- Design RL environments and coding tasks that teach models real software engineering workflows
- Build reward signals and verifiers to define what constitutes correct, high-quality code
- Run and analyze training experiments on frontier models to understand improvements and diagnose failures
- Debug system performance across the full stack and optimize training pipelines
- Collaborate with alignment and production teams to deploy research innovations at scale
- Focus on areas like agentic coding behaviors, code correctness, autonomous engineering, or accelerator performance
What they're looking for
- Python expertise including async/concurrent programming
- Full-stack system ownership and debugging
- Research design and rigorous experimental analysis
- Reinforcement learning, RLHF, or LLM post-training
- PyTorch and distributed training systems
- Program analysis, testing, or formal verification methods
- GPU/accelerator performance optimization
- Coding agent or sandbox development experience