Prime Intellect

Research Engineer - RL Infrastructure

San Francisco | Full-time | Mid-level | Added 2 days ago

About this role

Prime Intellect seeks a Research Engineer to improve its large-scale reinforcement learning (RL) infrastructure, with a focus on performance, memory efficiency, and system throughput. The role involves collaborating with engineers and researchers to improve training systems and contributing to the architectural design of RL training.

What you'll do

Build and enhance systems for large-scale RL training
Optimize training efficiency across various layers
Implement performance optimizations
Work on distributed training systems
Contribute to open-source libraries
Collaborate on systems improvements

What they're looking for

Systems engineering experience in AI/ML
Experience with PyTorch and distributed frameworks
Performance optimization skills
Knowledge of large-scale training techniques
Understanding of GPU architecture
Ability to identify and resolve bottlenecks
Comfort in fast-paced environments
Experience with CUDA / Triton kernels (plus)

Benefits

Competitive cash compensation and equity
Flexible remote work options
Visa sponsorship and relocation support
Quarterly team events and learning opportunities
Work with a highly technical team
