Anthropic
Performance Engineer, Inference Systems
San Francisco, CA | New York City, NY | Seattle, WA
From $850k | mid | Added 2 days ago
About this role
Anthropic seeks a Performance Engineer to optimize and validate its large-scale Claude inference system across throughput, latency, reliability, and correctness. You'll investigate performance gaps across the entire stack, own correctness evaluation pipelines, and partner with infrastructure teams to drive high-impact optimizations.
What you'll do
- Conduct cross-layer performance investigations to identify gaps between actual fleet performance and theoretical limits
- Own and improve the correctness evaluation pipeline that validates model outputs across hardware platforms and configurations
- Build observability tools, dashboards, and models to make performance metrics and their interactions visible across the stack
- Partner with kernel, serving, and infrastructure teams to prioritize and implement optimization opportunities
- Investigate performance regressions in production inference systems and identify their root causes
- Stack-rank optimization opportunities by impact and effort to guide team priorities
What they're looking for
- Performance engineering and profiling in production systems
- Python and ability to work with large codebases
- Data analysis tools such as SQL and pandas
- Roofline analysis and latency/throughput optimization
- ML inference systems and LLM serving infrastructure
- GPU/TPU accelerator performance concepts
- Distributed systems reliability engineering
- Technical communication and influence through evidence