Firecrawl

Research Engineer – Evals

San Francisco, CA (Hybrid) or Remote (Americas, UTC-3 to UTC-10) | $160k–$240k | Full-time | Mid-level | Added 2 days ago

About this role

Design and build the evaluation infrastructure for Firecrawl's web data extraction platform, owning the metrics, pipelines, and datasets that measure output quality across millions of websites. You'll translate evaluation findings into training signals for models and RL systems, working at the intersection of rigorous measurement and practical product impact.

What you'll do

Design and implement evaluation metrics and pipelines for scrape, crawl, extract, and map operations across diverse web formats
Build benchmark datasets that reflect the real-world distribution of customer data, including edge cases and difficult scenarios
Develop and validate LLM-as-judge systems for automated quality scoring, with human review tooling for edge cases
Integrate evals into CI/CD to catch regressions before production deployment
Work with RL and research engineers to convert quality measurements into reward signals and training feedback loops
Design and execute experiments to test hypotheses, communicating findings clearly to drive product and model decisions

What they're looking for

ML engineering and applied AI experience with production systems
LLM evaluation methodology and LLM-as-judge system design
Data quality assessment and unstructured data handling
Python and evaluation infrastructure development
Benchmark and dataset design with human labeling workflows
Metrics design and statistical rigor
CI/CD integration and automated testing
Clear technical communication and experimentation

Benefits

Salary: $160,000–$240,000/year (adjusted for location)
Equity: Up to 0.10%
Flexible location: San Francisco hybrid or remote (Americas, UTC-3 to UTC-10)
Full-time role at a fast-growing company (8-figure ARR, 120k+ GitHub stars)
Work on essential infrastructure for AI data extraction
Direct influence on model training and product decisions
