Anthropic
Software Engineer, Safeguards
San Francisco, CA | New York City, NY · From $485k · Mid · Added 2 days ago
About this role
Anthropic is seeking a Software Engineer for its Safeguards team to build safety and oversight systems for AI models. You'll develop monitoring tools, detection mechanisms, and multi-layered defenses to prevent misuse and protect users across its API platforms.
What you'll do
- Develop monitoring systems to detect unwanted API partner behaviors and create automated enforcement actions
- Build abuse detection mechanisms and infrastructure for AI systems
- Surface abuse patterns to research teams to improve model training and hardening
- Create robust, real-time, scalable safety defense systems
- Surface findings in internal dashboards for analyst review
What they're looking for
- Python and TypeScript proficiency
- Full-stack software development
- Abuse and fraud detection systems
- Technical communication with non-technical stakeholders
- AI/ML trust and safety mechanisms
- Prompt engineering and adversarial input understanding
- Internal tooling and operational systems
Benefits
- Annual salary: $320,000–$485,000 USD
- Hybrid work policy with 25% minimum office time
- Visa sponsorship available
- Work on AI safety and beneficial AI systems
- Collaborative team of researchers and engineers