Mistral AI
Research Engineer, Data Infrastructure
Palo Alto | Full-time | Mid-level | Added today
About this role
Mistral AI seeks a Research Engineer to design and operate large-scale data infrastructure supporting AI model training and fine-tuning. You'll architect distributed compute and storage systems, manage multi-cluster orchestration, and ensure reliable operations for mission-critical training workloads in a fast-growing AI environment.
What you'll do
- Build and scale distributed compute and storage systems for massive AI training workloads
- Architect multi-cluster orchestration layers to optimize workload placement across hardware and regions
- Design modern storage infrastructure to handle fine-tuning datasets at exabyte scale
- Develop internal training platforms supporting seamless model training across Kubernetes and SLURM environments
- Implement metadata and lineage systems for visibility across complex data and model pipelines
- Participate in on-call rotations and ensure operational excellence for production training jobs
What they're looking for
- Data infrastructure or MLOps engineering
- Python programming
- Kubernetes and cloud-native tooling
- Distributed systems debugging and design
- Large-scale storage architecture
- Modern columnar storage formats
- Multi-cluster orchestration
- Deployment and CI/CD workflows