Klaviyo

Site Reliability Engineer

Boston, MA | From $174k | mid | Added 2 days ago

About this role

Klaviyo seeks a Site Reliability Engineer to design and build highly available, scalable systems while championing operational excellence across the engineering organization. You'll focus on eliminating bottlenecks, improving system performance, and collaborating with cross-functional teams to deliver reliable infrastructure supporting the platform's growth.

What you'll do

  • Design systems and processes for high availability and scalability
  • Identify and eliminate performance bottlenecks to improve throughput
  • Participate in on-call rotation and resolve production issues quickly
  • Perform quantitative analysis to understand and scale complex systems
  • Advocate for preventative solutions with internal stakeholders and vendors
  • Promote Site Reliability best practices across the Engineering organization

What they're looking for

  • Python, Bash, and shell scripting
  • Linux administration and debugging (Ubuntu)
  • AWS services (EC2, IAM, CloudWatch, CloudTrail, S3, Lambda)
  • Infrastructure automation (CloudFormation, Terraform)
  • Kubernetes and Docker containerization
  • Monitoring and observability (Grafana, Prometheus, Filebeat, Logstash)
  • Distributed systems design and operation
  • Incident response and root cause analysis

Benefits

  • Remote work flexibility (2 days per week telecommuting permitted)
  • Competitive base salary ($131,082–$174,000 USD)
  • Annual cash bonus plan
  • Equity participation
  • Comprehensive health, welfare, and wellbeing benefits
  • Sign-on payments available