Sustainable Talent

Site Reliability Engineer

Santa Clara, CA (mid-level)

About this role

NVIDIA is seeking a Site Reliability Engineer for a contract role in Santa Clara, CA, focused on infrastructure planning and process support. The position involves maintaining cloud services and keeping automated tasks running efficiently for various NVIDIA software teams.

What you'll do

Monitor and recover assets in a private cloud environment
Stabilize virtualization infrastructure (ESXi, KVM, Hyper-V)
Deploy and maintain machine farms using automation tools (Chef, Ansible, Terraform)
Provide on-call L1 support for infrastructure issues
Analyze and debug OS, networking, and performance problems
Assist in deploying infrastructure configurations for NVIDIA technologies

What they're looking for

KVM and ESXi virtualization
Configuration management (Chef, Ansible, Terraform)
Cloud services management
On-call support experience
Debugging and troubleshooting skills
Understanding of networking concepts
Experience with NVIDIA GPUs and Tegra processors
Familiarity with performance monitoring tools

Benefits

Competitive pay ($65/hr - $85/hr)
Full benefits
Paid Time Off (PTO)
Supportive company culture
