TensorWave
Software Engineer
Remote | Full-time | Mid-level | Added 2 days ago
About this role
TensorWave is seeking a Senior Software Engineer to join its platform team, focusing on automating large-scale GPU clusters across various environments. The ideal candidate will drive the development of tooling and pipelines while collaborating with cross-functional teams to achieve organizational goals.
What you'll do
- Automate provisioning of bare metal GPU clusters
- Manage lifecycle of Slurm and Kubernetes clusters
- Develop GPU node configuration infrastructure
- Establish cluster validation pipelines and health checks
- Automate day-2 operations
- Maintain documentation and runbooks for operations
What they're looking for
- 5+ years in infrastructure/platform engineering
- 3+ years of production experience with Go
- Kubernetes internals knowledge
- Experience building Kubernetes Operators
- Experience with gRPC and REST APIs
- Familiarity with bare metal infrastructure
- Strong testing practices
- Observability stack ownership
Benefits
- Stock Options
- 100% paid Medical, Dental, and Vision insurance
- Health Savings Account contributions
- 100% paid Short and Long Term Disability insurance
- Flexible Spending Account
- Flexible PTO and Paid Holidays