Cartesia

Research Engineer, Data

HQ - San Francisco, CA · $200k–$350k · Full-time · Mid-level · Added 2 days ago

About this role

Cartesia is seeking a Research Engineer to lead data strategy for multilingual AI model training. You'll design large-scale datasets, build evaluation systems, and ensure models perform well across dozens of languages while mitigating bias and improving data quality.

What you'll do

Design and build large-scale multilingual datasets for model training with controlled experiments to measure performance impact
Develop evaluation frameworks for speech models using both manual annotation and automated metrics at scale
Implement data steering techniques to improve model intelligence and reduce bias
Build automated quality control systems to validate and filter generated data
Partner with product teams to ensure support for key languages and markets
Guide human annotation and evaluation processes across multiple languages

What they're looking for

Experience with large multilingual datasets
Generative models (speech, text, or multimodal)
Applied machine learning with data-centric focus
Human annotation and evaluation guidance
Scalable systems design bridging research and production
Linguistic nuance and cultural understanding
Data quality and bias mitigation
Python or similar programming languages

Benefits

Competitive base salary with equity package
Fully covered medical, dental, and vision insurance for you and your family
Parental leave (12 weeks maternity, 9 weeks paternity)
401(k), commuter allowance, flexible PTO, and daily meals/snacks
In-person collaboration in San Francisco, London, or Bangalore offices
Visa sponsorship support available
