Foxglove

ML Platform Engineer

San Francisco, CA | $183k–$310k | Full-time | Mid-level | Added 2 days ago

About this role

Foxglove seeks an ML Platform Engineer to design and operate production infrastructure for a robotics data platform handling petabyte-scale multimodal data. You'll own inference serving, embedding pipelines, training infrastructure, and cloud systems that enable rapid ML iteration across distributed robotics deployments.

What you'll do

Design and operate production inference infrastructure including model serving, autoscaling, and cost optimization
Own embedding and retrieval pipeline architecture for multimodal robotics data (images, video, point clouds, timeseries)
Build training and evaluation infrastructure with job orchestration, experiment tracking, and dataset versioning
Make cloud infrastructure decisions (AWS/GCP) balancing latency, throughput, reliability, and cost
Create platform abstractions that let product engineers ship ML features without managing infrastructure themselves
Evaluate and integrate third-party ML infrastructure components using clear build-vs-buy criteria

What they're looking for

Production ML infrastructure and model serving (vLLM, Triton, TorchServe)
Distributed systems and cloud infrastructure (AWS/GCP)
Vector databases and retrieval systems at scale
Model optimization and cloud cost management
ML orchestration and experiment tracking
System reliability and operational design
Communication across ML and engineering teams
Fine-tuning and domain adaptation for LLMs (bonus)

Benefits

$300 monthly commuter/workspace budget
Competitive equity in a Series B company
100% coverage of medical, dental, vision, and life insurance for employees; 75% for dependents
401(k) matching up to 4%
4 weeks vacation plus holidays and winter break
All-expenses-paid company off-sites twice yearly
