Sierra
Software Engineer, Site Reliability (SRE)
San Francisco, CA | $230k–$390k | Full-time | Mid-level | Added 2 days ago
About this role
Sierra is seeking a Software Engineer for its Site Reliability team to improve the reliability and scalability of its AI-driven infrastructure. The role involves overseeing observability systems, designing secure cloud infrastructure, and collaborating with engineering teams to ensure high availability and performance.
What you'll do
- Manage the observability stack for system health monitoring
- Collaborate on designing reliable and scalable systems
- Implement cloud infrastructure using Terraform and DevOps tools
- Enhance reliability of LLM deployments for effective operations
- Optimize deployment pipelines and incident management processes
- Establish SRE practices and influence engineering culture
What they're looking for
- 5+ years in Site Reliability or Infrastructure engineering
- Expertise in Terraform and AWS services
- Strong skills with observability tools such as Prometheus and Grafana
- Experience designing for system availability and scalability
- Familiarity with container orchestration and networking
- Ability to work collaboratively in dynamic environments
- Degree in Computer Science or related field
Benefits
- Innovative work environment
- Collaborative team culture
- Strong focus on professional growth
- Opportunity to work with advanced AI technologies
- In-person collaboration in San Francisco and other global offices