OpenAI
Systems Engineer (Network / Storage / Systems)
San Francisco | Full-time | Mid-level
About this role
Join OpenAI's Stargate team to engineer the physical and logical infrastructure powering large-scale AI systems. You'll architect and operationalize networking, storage, and compute platforms that enable frontier model training across global deployments, partnering with hardware and software teams to bring new systems into production at scale.
What you'll do
- Own system engineering across networking, storage, validation, and hardware bring-up workstreams
- Design and optimize network architectures including frontend, WAN, and out-of-band infrastructure
- Define storage architectures across rack, pod, cluster, and cloud tiers for performance and cost efficiency
- Lead hardware platform bring-up including imaging, provisioning, validation, and production readiness
- Debug complex system faults across firmware, NIC, GPU, and server layers with root cause analysis
- Build automation and tools to improve lab operations, fleet readiness, and deployment velocity
What they're looking for
- 7+ years of systems or infrastructure engineering experience
- Deep expertise in networking, storage systems, server platforms, or Linux systems
- Hardware bring-up and cluster deployment experience
- Low-level hardware/software debugging and cross-functional troubleshooting
- Python, Go, Bash, or similar scripting languages
- Hyperscale, AI cluster, HPC, or data center infrastructure knowledge
- Experience with OEM/ODM/JDM vendor management
- GPU cluster or accelerator infrastructure experience (preferred)
Benefits
- Hybrid work model: 3 days per week in the San Francisco office
- Relocation assistance available