Air Apps

Site Reliability Engineer (SRE)

Madrid (Remote) | Full-time | Mid-level | Added 2 days ago

About this role

Air Apps seeks an experienced Site Reliability Engineer to ensure system reliability, availability, and scalability across cloud environments. You'll design fault-tolerant infrastructure, implement observability solutions, and automate deployments while collaborating with development teams in their Lisbon office.

What you'll do

  • Design and implement scalable, reliable systems across cloud environments with fault tolerance
  • Develop observability tools including monitoring, logging, and alerting solutions
  • Automate infrastructure provisioning and deployment using Infrastructure as Code tools
  • Optimize system performance and incident response workflows to improve uptime
  • Conduct root cause analysis and implement measures to prevent recurring failures
  • Participate in on-call rotations to address system failures and minimize downtime

What they're looking for

  • Site Reliability Engineering or DevOps (4+ years of experience)
  • Cloud platforms (AWS, Azure, or GCP)
  • Observability tools (Prometheus, Grafana, ELK, Datadog)
  • Infrastructure as Code (Terraform, CloudFormation, Pulumi)
  • Containerization and orchestration (Docker, Kubernetes, Helm)
  • Scripting (Bash, Python, or Go)
  • Linux system administration and networking
  • Incident management and root cause analysis

Benefits

  • Apple hardware ecosystem
  • Annual bonus
  • Top-tier health and life insurance
  • Transportation budget
  • Coverflex benefits package
  • Childcare support