Chime
Software Engineer, Machine Learning Platform
San Francisco, CA, USA
Mid-level
Added 2 days ago
About this role
Chime seeks an ML Platform Engineer to design and operate scalable machine learning infrastructure on AWS, enabling data scientists and ML engineers to develop, train, deploy, and monitor models efficiently. You'll work on distributed systems, feature stores, data pipelines, and CI/CD workflows while partnering with cross-functional teams to improve developer experience and platform reliability.
What you'll do
- Design and operate scalable ML infrastructure and distributed training systems on AWS using Ray
- Build and maintain infrastructure-as-code with Terraform and manage cloud resources
- Develop feature stores, data ingestion pipelines, and streaming systems using technologies like Kafka and Kinesis
- Improve CI/CD workflows, observability, and cost efficiency across ML workloads
- Collaborate with Data Science and ML Engineering teams to enhance developer experience and platform architecture
- Participate in on-call rotations to support production ML systems
What they're looking for
- ML infrastructure and platform engineering (5+ years)
- Distributed systems and large-scale data processing
- AWS cloud computing and infrastructure-as-code (Terraform)
- Docker, Kubernetes, and container orchestration
- Python, Go, Scala, or Java programming
- CI/CD pipelines and DevOps practices
- Distributed computing frameworks (Spark, Ray)
- GPU programming (CUDA) and optimization
Benefits
- Competitive base salary ($187,000–$259,000) plus bonus
- Equity package
- Comprehensive benefits package
- Work on cutting-edge ML infrastructure and AI technologies
- Collaborative team environment with diverse perspectives
- Opportunity to shape platform architecture and technical roadmap