OpenAI
Software Engineer, Caching Infrastructure
San Francisco, Full-time, Mid-level
About this role
OpenAI is seeking an experienced Software Engineer to design and scale a multi-tenant caching infrastructure platform that supports critical services across inference, identity, and product experiences. You'll define the long-term vision for caching capabilities while collaborating with infrastructure and product teams to ensure high performance, reliability, and cost-efficiency.
What you'll do
- Design, build, and operate OpenAI's multi-tenant caching platform supporting inference, identity, quota, and product systems
- Define long-term vision and roadmap for caching infrastructure, balancing performance, durability, and cost
- Collaborate with networking, observability, database, and product teams to meet platform requirements
- Optimize for latency, reliability, throughput, and cost in platform design decisions
- Implement and tune distributed caching solutions in production environments
- Operate and scale Kubernetes-based caching services with autoscaling capabilities
What they're looking for
- Distributed systems design and scaling (5+ years of experience)
- Redis or Memcached expertise including clustering and tuning
- Kubernetes and service orchestration
- Service meshes and load balancing (e.g., Envoy)
- Networking fundamentals
- System performance optimization and latency analysis
- Production infrastructure experience
- Client-side connection patterns and durability configurations