Klaviyo
Site Reliability Engineer
Boston, MA | From $174k | mid | Added 2 days ago
About this role
Klaviyo seeks a Site Reliability Engineer to design and build highly available, scalable systems while championing operational excellence across the engineering organization. You'll focus on eliminating bottlenecks, improving system performance, and collaborating with cross-functional teams to deliver reliable infrastructure supporting the platform's growth.
What you'll do
- Design systems and processes for high availability and scalability
- Identify and eliminate performance bottlenecks to improve throughput
- Participate in on-call rotation and resolve production issues quickly
- Perform quantitative analysis to understand and scale complex systems
- Advocate for preventative solutions with internal stakeholders and vendors
- Promote Site Reliability best practices across the Engineering organization
What they're looking for
- Python and Bash/shell scripting
- Linux administration and debugging (Ubuntu)
- AWS services (EC2, IAM, CloudWatch, CloudTrail, S3, Lambda)
- Infrastructure automation (CloudFormation, Terraform)
- Kubernetes and Docker containerization
- Monitoring and observability (Grafana, Prometheus, Filebeat, Logstash)
- Distributed systems design and operation
- Incident response and root cause analysis
Benefits
- Remote work flexibility (2 days per week telecommuting permitted)
- Competitive base salary ($131,082–$174,000 USD)
- Annual cash bonus plan
- Equity participation
- Comprehensive health, welfare, and wellbeing benefits
- Sign-on payments available