Anthropic

Performance Engineer, Inference Systems

San Francisco, CA | New York City, NY | Seattle, WA | From $850k | mid | Added 2 days ago

About this role

Anthropic seeks a Performance Engineer to optimize and validate its large-scale Claude inference system across throughput, latency, reliability, and correctness. You'll investigate performance gaps across the entire stack, own the correctness evaluation pipeline, and partner with infrastructure teams to drive high-impact optimizations.

What you'll do

  • Conduct cross-layer performance investigations to identify gaps between actual fleet performance and theoretical limits
  • Own and improve the correctness evaluation pipeline that validates model outputs across hardware platforms and configurations
  • Build observability tools, dashboards, and models to make performance metrics and their interactions visible across the stack
  • Partner with kernel, serving, and infrastructure teams to prioritize and implement optimization opportunities
  • Investigate performance regressions and root causes in production inference systems
  • Stack-rank optimization opportunities by impact and effort to guide team priorities

What they're looking for

  • Performance engineering and profiling in production systems
  • Python and ability to work with large codebases
  • Data analysis tools such as SQL and pandas
  • Roofline analysis and latency/throughput optimization
  • ML inference systems and LLM serving infrastructure
  • GPU/TPU accelerator performance concepts
  • Distributed systems reliability engineering
  • Technical communication and influence through evidence