AfterQuery
Software Engineer - RL Environments
San Francisco | $180k–$220k | Full-time | Mid-level | Added 2 days ago
About this role
AfterQuery is seeking a Software Engineer to design datasets and evaluation frameworks that directly influence how frontier AI models learn. You'll work with top AI labs to develop data collection strategies, build reward signals for reinforcement learning pipelines, and create metrics that measure model improvement across domains like finance and code.
What you'll do
- Design data slices and strategies that expose meaningful model failure modes across finance, code, and enterprise domains
- Build and refine evaluation rubrics and reward signals for RLHF and RLVR training pipelines
- Model annotator behavior and run experiments to improve model capabilities
- Develop quantitative frameworks measuring dataset quality, diversity, and impact on model alignment
- Create and manage real-world and synthetic data pipelines
- Partner with AI lab research teams to translate training objectives into concrete data specifications
What they're looking for
- Data collection and curation
- Evaluation framework design
- Reinforcement learning fundamentals
- Reward signal development
- Experimental design and iteration
- Quantitative analysis and metrics
- Python or similar programming languages
- Understanding of model training pipelines
Benefits
- $200k base salary
- Profit share (~150% of base)
- Competitive equity
- Based in San Francisco
- Work directly with frontier AI labs
- Impact on foundation model development