Anthropic

Research Engineer, Code RL (Reinforcement Learning)

San Francisco, CA | New York City, NY | From $850k | mid | Added 2 days ago

About this role

Anthropic is seeking a Research Engineer to advance Claude's code generation and software engineering capabilities through reinforcement learning. You'll design RL environments, build reward systems, run training experiments, and optimize pipelines to help AI models write, test, debug, and deploy real software end-to-end.

What you'll do

Design RL environments and coding tasks that teach models real software engineering workflows
Build reward signals and verifiers to define what constitutes correct, high-quality code
Run and analyze training experiments on frontier models to diagnose improvements and failures
Debug system performance across the full stack and optimize training pipelines
Collaborate with alignment and production teams to deploy research innovations at scale
Focus on areas like agentic coding behaviors, code correctness, autonomous engineering, or accelerator performance

What they're looking for

Python expertise including async/concurrent programming
Full-stack system ownership and debugging
Research design and rigorous experimental analysis
Reinforcement learning, RLHF, or LLM post-training
PyTorch and distributed training systems
Program analysis, testing, or formal verification methods
GPU/accelerator performance optimization
Coding agent or sandbox development experience
