Prime Intellect
Research Engineer - RL Infrastructure
San Francisco | Full-time | Mid-level | Added 2 days ago
About this role
Prime Intellect seeks a Research Engineer to enhance its large-scale reinforcement learning (RL) infrastructure, focusing on optimizing performance, memory efficiency, and system throughput. The role involves collaborating with engineers and researchers to improve training systems and contribute to the architectural design for RL training.
What you'll do
- Build and enhance systems for large-scale RL training
- Optimize training efficiency across various layers
- Implement performance optimizations
- Work on distributed training systems
- Contribute to open-source libraries
- Collaborate on systems improvements
What they're looking for
- Systems engineering in AI/ML
- Experience with PyTorch and distributed frameworks
- Performance optimization skills
- Knowledge of large-scale training techniques
- Understanding of GPU architecture
- Ability to identify and resolve bottlenecks
- Comfort in fast-paced environments
- Experience with CUDA / Triton kernels (plus)
Benefits
- Competitive cash compensation and equity
- Flexible remote work options
- Visa sponsorship and relocation support
- Quarterly team events and learning opportunities
- Work with a highly technical team