Together AI

Customer Success Engineer (CSE), GPU Cluster

San Francisco · Mid-level · Added today

About this role

The Customer Success Engineer (CSE) for GPU Clusters at Together AI is the primary technical contact for key customer relationships and is responsible for the operational effectiveness of large-scale GPU infrastructure. The role combines deep technical knowledge with customer partnership to drive success and growth for both the customer and the company.

What you'll do

  • Serve as the main technical contact for a dedicated strategic customer
  • Engage regularly through status reports and technical meetings
  • Translate customer feedback into actionable insights for various teams
  • Manage issues, escalations, and root cause analysis across infrastructure domains
  • Oversee hardware lifecycle management for GPU deployments
  • Develop observability strategies for customer infrastructure

What they're looking for

  • 5+ years in a customer-facing technical role
  • Expertise in GPU infrastructure and health diagnostics
  • Experience with Ethernet and InfiniBand architectures
  • Knowledge of enterprise storage systems
  • Understanding of data center operations and SLA management
  • Proficiency in monitoring tools like Prometheus and Grafana
  • Strong incident management and communication skills
  • Familiarity with Python or Bash

Benefits

  • Competitive compensation and startup equity
  • Health insurance
  • Flexibility for remote work
  • Generous salary range based on experience
  • Opportunity to work in a research-driven environment
  • Collaborative culture focused on AI innovation