OpenAI
Research Engineer/Scientist - Human Alignment, Consumer Devices
San Francisco (Remote) | Full-time | Mid-level
About this role
Join OpenAI's Future of Computing Research team to develop RLHF and post-training methods for personalized, multimodal AI systems. You'll build reward models, evaluation frameworks, and learning foundations that help AI systems adapt to individual users over time while maintaining alignment and trustworthiness in real-world consumer device applications.
What you'll do
- Develop RLHF and post-training methods for multimodal models, with a focus on personalization
- Build reward models and preference-learning pipelines for adaptive model behavior
- Design datasets, rubrics, and evaluation frameworks that capture user preferences and long-term value
- Run experiments on policy improvement using explicit feedback and implicit signals
- Tackle long-horizon evaluation problems where model quality depends on sustained behavioral improvements
- Collaborate with safety researchers to ensure personalization remains aligned and interpretable
What they're looking for
- RLHF and reward modeling
- Preference optimization and post-training methods
- Reinforcement learning and policy improvement
- Experimental design and rigorous empirical evaluation
- Dataset creation and human preference annotation
- Recommender systems or personalization experience
- Multimodal AI systems
- Cross-functional collaboration across research, engineering, and product
Benefits
- Hybrid work model (4 days in-office per week in San Francisco)
- Relocation assistance for new employees
- Work on frontier research with real product impact
- Collaboration with top researchers, engineers, designers, and safety experts