Elastic
Site Reliability Engineer (Hosted Infra) - Platform
United States · From $210.6k · mid · Added 2 days ago
About this role
Elastic seeks a Site Reliability Engineer to design and operate multi-cloud infrastructure powering Elastic Cloud across 70+ regions and thousands of hosts. You'll build automation tools, optimize system reliability, and strengthen observability while managing on-call responsibilities in a globally distributed team.
What you'll do
- Engineer software and internal tools to automate large-scale multi-cloud infrastructure
- Optimize host reliability and lifecycle management across multiple cloud providers
- Design and implement monitoring and alerting systems focused on incident prevention
- Scale global infrastructure and evolve management processes for growing demand
- Participate in on-call rotation, incident response, and postmortem analysis
- Contribute to code reviews and mentor team members
What they're looking for
- Golang software development
- Large-scale cloud infrastructure operations (100+ hosts)
- Linux systems administration and OS-level debugging
- Containerized workloads and production Kubernetes
- Infrastructure as Code tools (Terraform, Puppet, Ansible)
- Observability and monitoring tools (Prometheus, Graphite, Elastic Stack)
- Systems thinking and root cause analysis
- Clear technical documentation and communication
Benefits
- Base salary compensation (no variable component)
- Stock program participation
- Global team collaboration across multiple time zones
- Professional development and mentorship opportunities