Klaviyo

Site Reliability Engineer

Boston, MA | From $174k | mid | Added 2 days ago

About this role

Klaviyo seeks a Site Reliability Engineer to design and build highly available, scalable systems while championing operational excellence across the engineering organization. You'll focus on eliminating bottlenecks, improving system performance, and collaborating with cross-functional teams to deliver reliable infrastructure supporting the platform's growth.

What you'll do

Design systems and processes for high availability and scalability
Identify and eliminate performance bottlenecks to improve throughput
Participate in on-call rotation and resolve production issues quickly
Perform quantitative analysis to understand and scale complex systems
Advocate for preventative solutions with internal stakeholders and vendors
Promote Site Reliability best practices across the Engineering organization

What they're looking for

Python and shell scripting (Bash)
Linux administration and debugging (Ubuntu)
AWS services (EC2, IAM, CloudWatch, CloudTrail, S3, Lambda)
Infrastructure automation (CloudFormation, Terraform)
Kubernetes and Docker containerization
Monitoring and observability (Grafana, Prometheus, Filebeat, Logstash)
Distributed systems design and operation
Incident response and root cause analysis

Benefits

Remote work flexibility (2 days per week telecommuting permitted)
Competitive base salary ($131,082–$174,000 USD)
Annual cash bonus plan
Equity participation
Comprehensive health, welfare, and wellbeing benefits
Sign-on payments available
