
Artificial Intelligence • Gaming • Automotive
NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence. NVIDIA pioneers advancements in graphical processing units (GPUs), cloud computing, data centers, and virtual reality, with a focus on gaming, automotive, healthcare, and robotics industries. The company's innovations, such as NVIDIA Omniverse, transform traditional digital processes by enabling high-fidelity simulations and rendering tasks. Their applications span various industries, from autonomous vehicles using NVIDIA DRIVE to healthcare solutions with NVIDIA Clara, and AI-driven analytics and workflows.
August 22
🏄 California – Remote
💵 $168k - $333.5k / year
⏰ Full Time
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor

Artificial Intelligence • Gaming • Automotive
NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence. NVIDIA pioneers advancements in graphical processing units (GPUs), cloud computing, data centers, and virtual reality, with a focus on gaming, automotive, healthcare, and robotics industries. The company's innovations, such as NVIDIA Omniverse, transform traditional digital processes by enabling high-fidelity simulations and rendering tasks. Their applications span various industries, from autonomous vehicles using NVIDIA DRIVE to healthcare solutions with NVIDIA Clara, and AI-driven analytics and workflows.
• Design, implement and support operational and reliability aspects of large scale Observability & Telemetry collection platform with a focus on performance at scale, real time monitoring, logging and alerting • Engage in and improve the whole lifecycle of services—from inception and design through deployment, operation and refinement • Maintain services once they are live by measuring and monitoring availability, latency and overall system health • Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity • Practice sustainable incident response and blameless postmortems • Be part of an on call rotation to support production systems
• BS degree in Computer Science or a related technical field involving coding (e.g., physics or mathematics), or equivalent experience • 5+ years of experience with Infrastructure automation, distributed systems design, experience with design, develop tools for running large scale private or public cloud system in Production • 8+ years experience delivering foundational infrastructure and observability platforms. • Experience in one or more of the following: Python, Go, Perl or Ruby • In depth knowledge on Linux, Networking and Containers
• Equity and benefits
Apply NowAugust 20
Salesforce DevOps Architect providing leadership for multiple Salesforce teams. Managing CI/CD pipelines and enforcing development standards in a remote role.
Cloud
August 20
Senior SRE building scalable, secure infra for AI compute at TensorWave. Designs low-level systems and automates infrastructure.
Cloud
JavaScript
Kubernetes
Linux
Rust
Spring
Terraform
Go
August 20
Deployment Engineer at Atolio: ensure secure, scalable deployments of enterprise search across environments; build automation and collaborate with success teams.
AWS
Azure
Cloud
Distributed Systems
Google Cloud Platform
Grafana
Kubernetes
Python
ServiceNow
Splunk
Terraform
Go
August 19
Senior DevOps Engineer at Syniti builds CI/CD pipelines and cloud automation; mentors teams and optimizes DevOps practices for scalable data platform.
🇺🇸 United States – Remote
💰 Private Equity Round on 2017-08
⏰ Full Time
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor
AWS
Cloud
Docker
Jenkins
Kubernetes
Python
Terraform
Go
August 19
Lead global SRE team at Syniti, ensuring compliant, scalable SaaS platforms; drive IaC, observability, and security across AWS, Azure, and on-prem. Mentor engineers and align with zero-trust principles.
🇺🇸 United States – Remote
💰 Private Equity Round on 2017-08
⏰ Full Time
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor
AWS
Azure
Cloud
Kubernetes
Python
Terraform
Go