
Sierra

Software Engineer, Site Reliability (SRE)

San Francisco, CA | $230k–$390k | Full-time | Mid-level | Added 2 days ago

About this role

Sierra is seeking a Software Engineer for its Site Reliability team to improve the reliability and scalability of its AI-driven infrastructure. The role involves overseeing observability systems, designing secure cloud infrastructure, and collaborating with engineering teams to ensure high availability and performance.

What you'll do

  • Manage the observability stack for system health monitoring
  • Collaborate on designing reliable and scalable systems
  • Implement cloud infrastructure using Terraform and DevOps tools
  • Enhance reliability of LLM deployments for effective operations
  • Optimize deployment pipelines and incident management processes
  • Establish SRE practices and influence engineering culture

What they're looking for

  • 5+ years in Site Reliability or Infrastructure engineering
  • Expertise in Terraform and AWS services
  • Strong skills with observability tools such as Prometheus and Grafana
  • Experience designing for system availability and scalability
  • Familiarity with container orchestration and networking
  • Ability to work collaboratively in dynamic environments
  • Degree in Computer Science or related field

Benefits

  • Innovative work environment
  • Collaborative team culture
  • Strong focus on professional growth
  • Opportunity to work with advanced AI technologies
  • In-person collaboration in San Francisco and other global offices