AfterQuery

Software Engineer - RL Environments

San Francisco · $180k–$220k · Full-time · Mid-level · Added 2 days ago

About this role

AfterQuery is seeking a Software Engineer to design datasets and evaluation frameworks that directly influence how frontier AI models learn. You'll work with top AI labs to develop data collection strategies, build reward signals for reinforcement learning pipelines, and create metrics that measure model improvement across domains like finance and code.

What you'll do

Design data slices and strategies that expose meaningful model failure modes across finance, code, and enterprise domains
Build and refine evaluation rubrics and reward signals for RLHF and RLVR training pipelines
Model annotator behavior and run experiments to improve model capabilities
Develop quantitative frameworks measuring dataset quality, diversity, and impact on model alignment
Create and manage real-world and synthetic data pipelines
Partner with AI lab research teams to translate training objectives into concrete data specifications

What they're looking for

Data collection and curation
Evaluation framework design
Reinforcement learning fundamentals
Reward signal development
Experimental design and iteration
Quantitative analysis and metrics
Python or similar programming languages
Understanding of model training pipelines

Benefits

$200k base salary
Profit share (~150% of base)
Competitive equity
Based in San Francisco
Work directly with frontier AI labs
Impact on foundation model development
