Prime Intellect

Research Engineer - RL Infrastructure

San Francisco | Full-time | Mid-level | Added 2 days ago

About this role

Prime Intellect seeks a Research Engineer to enhance its large-scale reinforcement learning (RL) infrastructure, focusing on performance, memory efficiency, and system throughput. The role involves collaborating with engineers and researchers to improve training systems and contributing to the architectural design of RL training.

What you'll do

  • Build and enhance systems for large-scale RL training
  • Optimize training efficiency across various layers
  • Implement performance optimizations
  • Work on distributed training systems
  • Contribute to open-source libraries
  • Collaborate on systems improvements

What they're looking for

  • Systems engineering in AI/ML
  • Experience with PyTorch and distributed frameworks
  • Performance optimization skills
  • Knowledge of large-scale training techniques
  • Understanding of GPU architecture
  • Ability to identify and resolve bottlenecks
  • Comfort in fast-paced environments
  • Experience with CUDA / Triton kernels (plus)

Benefits

  • Competitive cash compensation and equity
  • Flexible remote work options
  • Visa sponsorship and relocation support
  • Quarterly team events and learning opportunities
  • Work with a highly technical team