Anthropic

Performance Engineer, Inference Systems

San Francisco, CA | New York City, NY | Seattle, WA | From $850k | Mid | Added 2 days ago

About this role

Anthropic seeks a Performance Engineer to optimize and validate its large-scale Claude inference system across throughput, latency, reliability, and correctness. You'll investigate performance gaps across the entire stack, own the correctness evaluation pipelines, and partner with infrastructure teams to drive high-impact optimizations.

What you'll do

Conduct cross-layer performance investigations to identify gaps between actual fleet performance and theoretical limits
Own and improve the correctness evaluation pipeline that validates model outputs across hardware platforms and configurations
Build observability tools, dashboards, and models to make performance metrics and their interactions visible across the stack
Partner with kernel, serving, and infrastructure teams to prioritize and implement optimization opportunities
Investigate performance regressions and root causes in production inference systems
Stack-rank optimization opportunities by impact and effort to guide team priorities

What they're looking for

Performance engineering and profiling in production systems
Python and ability to work with large codebases
Data analysis tools such as SQL and pandas
Roofline analysis and latency/throughput optimization
ML inference systems and LLM serving infrastructure
GPU/TPU accelerator performance concepts
Distributed systems reliability engineering
Technical communication and influence through evidence
