Mistral AI
Site Reliability Engineer - NYC
New York | Full-time | Mid-level | Added 2 days ago
About this role
Mistral AI is looking for a skilled Site Reliability Engineer to improve the reliability and performance of its AI platform and applications. The role involves collaborating with engineers and researchers to maintain scalable infrastructure while driving improvements in automation and system monitoring.
What you'll do
- Design and maintain scalable, fault-tolerant infrastructure
- Ensure high availability of platforms and ML workloads
- Troubleshoot production systems and respond to incidents
- Implement monitoring and incident response systems
- Drive automation and orchestration improvements
- Collaborate with security teams to ensure compliance
What they're looking for
- Master’s degree in Computer Science or related field
- 7+ years in DevOps/SRE roles
- Experience with cloud computing and distributed systems
- Experience with CI/CD and with containerization and orchestration tools such as Docker and Kubernetes
- Knowledge of monitoring and observability tools
- Proficiency in scripting languages
- Strong problem-solving skills
- Self-motivated and effective in fast-paced environments