Elastic

Site Reliability Engineer (Hosted Infra) - Platform

United States | From $210.6k | Mid | Added 2 days ago

About this role

Elastic seeks a Site Reliability Engineer to design and operate multi-cloud infrastructure powering Elastic Cloud across 70+ regions and thousands of hosts. You'll build automation tools, optimize system reliability, and strengthen observability while managing on-call responsibilities in a globally distributed team.

What you'll do

Engineer software and internal tools to automate large-scale multi-cloud infrastructure
Optimize host reliability and lifecycle management across multiple cloud providers
Design and implement monitoring and alerting systems focused on incident prevention
Scale global infrastructure and evolve management processes for growing demand
Participate in on-call rotation, incident response, and postmortem analysis
Contribute to code reviews and mentor team members

What they're looking for

Golang software development
Large-scale cloud infrastructure operations (100+ hosts)
Linux systems administration and OS-level debugging
Containerized workloads and production Kubernetes
Infrastructure as Code tools (Terraform, Puppet, Ansible)
Observability and monitoring tools (Prometheus, Graphite, Elastic Stack)
Systems thinking and root cause analysis
Clear technical documentation and communication

Benefits

Base salary compensation (no variable component)
Stock program participation
Global team collaboration across multiple time zones
Professional development and mentorship opportunities
