Mercor

Research Engineer – Benchmarking, Evals & Failure Analysis

San Francisco | $130k–$500k | Full-time | Mid-level | Added 2 days ago

About this role

Mercor is seeking a Research Engineer to enhance AI models through benchmarking, evaluation systems, and failure analysis. This role involves collaborating with teams to define metrics and improve data quality in a fast-paced environment in San Francisco.

What you'll do

  • Design and maintain benchmarking metrics for various AI behaviors
  • Develop and manage evaluation systems for tracking model performance
  • Conduct failure analysis on model outputs to identify improvement areas
  • Create and refine rubrics and scoring frameworks for evaluations
  • Assess data quality and impact on benchmarks to guide data strategies
  • Collaborate with teams to align evaluations with training goals

What they're looking for

  • Background in applied research and model evaluation
  • Strong coding skills related to ML models
  • Familiarity with data structures and algorithms
  • Experience with APIs and SQL/NoSQL databases
  • Ability to analyze model behavior and evaluate data quality
  • Willingness to work in-office in a dynamic setting

Benefits

  • Bi-annual performance bonuses
  • Equity grant vesting over 4 years
  • Relocation bonuses up to $15k
  • Housing bonuses for nearby residents
  • $1.5k monthly meal stipend
  • Free Equinox membership