Air Apps
Site Reliability Engineer (SRE)
Madrid (Remote) | Full-time | Mid-level | Added 2 days ago
About this role
Air Apps seeks an experienced Site Reliability Engineer to ensure system reliability, availability, and scalability across cloud environments. You'll design fault-tolerant infrastructure, implement observability solutions, and automate deployments while collaborating with development teams in their Lisbon office.
What you'll do
- Design and implement scalable, fault-tolerant systems across cloud environments
- Develop observability tools including monitoring, logging, and alerting solutions
- Automate infrastructure provisioning and deployment using Infrastructure as Code tools
- Optimize system performance and incident response workflows to improve uptime
- Conduct root cause analysis and implement measures to prevent failures from recurring
- Participate in on-call rotations to address system failures and minimize downtime
What they're looking for
- Site Reliability Engineering or DevOps (4+ years of experience)
- Cloud platforms (AWS, Azure, or GCP)
- Observability tools (Prometheus, Grafana, ELK, Datadog)
- Infrastructure as Code (Terraform, CloudFormation, Pulumi)
- Containerization and orchestration (Docker, Kubernetes, Helm)
- Scripting (Bash, Python, or Go)
- Linux system administration and networking
- Incident management and root cause analysis
Benefits
- Apple hardware ecosystem
- Annual bonus
- Top-tier health and life insurance
- Transportation budget
- Coverflex benefits package
- Childcare support