
Exa Labs

Software Engineer, Infrastructure

San Francisco, California | $180k–$350k | Full-time | Mid-level | Added 2 days ago

About this role

Exa, an applied AI lab building next-generation search infrastructure, seeks an Infrastructure Engineer to design and operate massive-scale systems powering web crawling, embedding models, and vector databases. You'll build the foundational tooling that enables rapid innovation across GPU clusters, distributed training, and high-performance inference systems.

What you'll do

  • Scale GPU infrastructure to process web-scale data cost-efficiently across multiple regions and clouds
  • Orchestrate multi-region training and inference workloads on GPU clusters using Kubernetes and Ray
  • Design and maintain advanced LLM gateway and CI/CD systems for AI-native development
  • Build observability and monitoring tooling to support massive distributed systems
  • Create custom build infrastructure and caching solutions using Nix
  • Automate software maintenance and infrastructure improvements company-wide

What they're looking for

  • Distributed systems and large-scale infrastructure design
  • Kubernetes and GPU cluster orchestration
  • Ray or similar distributed computing frameworks
  • Infrastructure-as-code and automation (Nix, Terraform, etc.)
  • CI/CD pipeline design and optimization
  • Multi-cloud and multi-region deployment
  • Systems performance optimization and cost efficiency
  • Observability and monitoring systems

Benefits

  • Premium healthcare (medical, dental, vision)
  • Fertility benefits
  • 16 weeks fully paid parental leave
  • Monthly wellness stipend
  • Visa sponsorship available for international candidates
  • In-person work in San Francisco