Mercor
Research Engineer – Benchmarking, Evals & Failure Analysis
San Francisco | $130k–$500k | Full-time | Mid-level | Added 2 days ago
About this role
Mercor is seeking a Research Engineer to enhance AI models through benchmarking, evaluation systems, and failure analysis. This role involves collaborating with teams to define metrics and improve data quality in a fast-paced environment in San Francisco.
What you'll do
- Design and maintain benchmarking metrics for various AI behaviors
- Develop and manage evaluation systems for tracking model performance
- Conduct failure analysis on model outputs to identify improvement areas
- Create and refine rubrics and scoring frameworks for evaluations
- Assess data quality and its impact on benchmarks to guide data strategies
- Collaborate with teams to align evaluations with training goals
What they're looking for
- Background in applied research and model evaluation
- Strong coding skills related to ML models
- Familiarity with data structures and algorithms
- Experience with APIs and SQL/NoSQL databases
- Ability to analyze model behavior and evaluate data quality
- Willingness to work in-office in a dynamic setting
Benefits
- Bi-annual performance bonuses
- Equity grant vesting over 4 years
- Relocation bonuses up to $15k
- Housing bonuses for nearby residents
- $1.5k monthly meal stipend
- Free Equinox membership