Character.AI

Research Engineer, AI Safety & Alignment

Redwood City, CA (Remote) | $225k–$400k | Full-time | Mid-level | Added 2 days ago

About this role

Develop and implement AI safety techniques to make large language models more reliable, honest, and aligned with human values. This role combines cutting-edge research in model alignment and interpretability with practical engineering to protect millions of users.

What you'll do

  • Design evaluation methodologies and metrics to assess safety and alignment of large language models
  • Research and develop alignment techniques including value learning, interpretability, and RLHF
  • Conduct adversarial testing to identify vulnerabilities and failure modes
  • Mitigate biases, toxicity, and harmful behaviors in models through fine-tuning and reinforcement learning
  • Translate safety research into scalable solutions with engineering and product teams
  • Publish findings and contribute to the AI safety research community

What they're looking for

  • Machine learning and transformer architectures
  • Reinforcement learning from human feedback (RLHF)
  • Production code development (Python or similar)
  • GPU training, serving, and debugging
  • Data pipelines and infrastructure
  • Model interpretability and explainability
  • Adversarial testing and vulnerability assessment
  • Distributed model training

Benefits

  • Work on critical AI safety challenges affecting millions of users
  • Contribute to academic publications and present at conferences
  • Collaborate with a leading AI research team at a unicorn AI company
  • Located in Redwood City, CA
  • Opportunity to shape responsible AI development