OpenAI

Research Engineer/Scientist - Human Alignment, Consumer Devices

San Francisco (Remote), full-time, mid-level

About this role

Join OpenAI's Future of Computing Research team to develop RLHF and post-training methods for personalized, multimodal AI systems. You'll build reward models, evaluation frameworks, and learning foundations that help AI systems adapt to individual users over time while maintaining alignment and trustworthiness in real-world consumer device applications.

What you'll do

Develop RLHF and post-training methods for multimodal models with focus on personalization
Build reward models and preference-learning pipelines for adaptive model behavior
Design datasets, rubrics, and evaluation frameworks capturing user preferences and long-term value
Run experiments on policy improvement using explicit feedback and implicit signals
Tackle long-horizon evaluation problems where model quality depends on sustained behavioral improvements
Collaborate with safety researchers to ensure personalization remains aligned and interpretable

What they're looking for

RLHF and reward modeling
Preference optimization and post-training methods
Reinforcement learning and policy improvement
Experimental design and rigorous empirical evaluation
Dataset creation and human preference annotation
Recommender systems or personalization experience
Multimodal AI systems
Cross-functional collaboration across research, engineering, and product

Benefits

Hybrid work model (4 days in-office per week in San Francisco)
Relocation assistance for new employees
Work on frontier research with real product impact
Collaboration with top researchers, engineers, designers, and safety experts
