Elastic

Site Reliability Engineer (Hosted Infra) - Platform

United States · From $210.6k · mid · Added 2 days ago

About this role

Elastic seeks a Site Reliability Engineer to design and operate multi-cloud infrastructure powering Elastic Cloud across 70+ regions and thousands of hosts. You'll build automation tools, optimize system reliability, and strengthen observability while managing on-call responsibilities in a globally distributed team.

What you'll do

  • Engineer software and internal tools to automate large-scale multi-cloud infrastructure
  • Optimize host reliability and lifecycle management across multiple cloud providers
  • Design and implement monitoring and alerting systems focused on incident prevention
  • Scale global infrastructure and evolve management processes for growing demand
  • Participate in on-call rotation, incident response, and postmortem analysis
  • Contribute to code reviews and mentor team members

What they're looking for

  • Golang software development
  • Large-scale cloud infrastructure operations (100+ hosts)
  • Linux systems administration and OS-level debugging
  • Containerized workloads and production Kubernetes
  • Infrastructure as Code tools (Terraform, Puppet, Ansible)
  • Observability and monitoring tools (Prometheus, Graphite, Elastic Stack)
  • Systems thinking and root cause analysis
  • Clear technical documentation and communication

Benefits

  • Base salary compensation (no variable component)
  • Stock program participation
  • Global team collaboration across multiple time zones
  • Professional development and mentorship opportunities