Distinguished Engineer – GPU Fleet Operations Automation

🕒 January 2

NVIDIA

10,000+ employees

Founded 1993

Artificial Intelligence • Gaming • Automotive

NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence. NVIDIA pioneers advancements in graphics processing units (GPUs), cloud computing, data centers, and virtual reality, with a focus on the gaming, automotive, healthcare, and robotics industries. The company's innovations, such as NVIDIA Omniverse, transform traditional digital processes by enabling high-fidelity simulation and rendering. Its applications span industries, from autonomous vehicles using NVIDIA DRIVE to healthcare solutions with NVIDIA Clara to AI-driven analytics and workflows.

📋 Description

• Various Architectural Work: Define and drive the technical implementation of the DGX Cloud operations practice for the GPU fleet lifecycle.
• Collaborate on Cross Domain Disciplines: Drive the technical strategy for, and awareness of, best practices and technical capabilities in DGX Cloud engineering practices.
• Accelerate Integration: Guide technical delivery into DGX Cloud across all delivery environments: enterprise, public cloud, and high-security, isolated, and sovereign.
• Engage Stakeholders: Collaborate with customers, infrastructure providers, and partners to ensure NVIDIA’s solutions set the industry standard for operational excellence.
• Full Software and System Lifecycle: From ideation through architecture, design, development, deployment, operations, and full lifecycle management, lead all technical aspects of planning and continuous evolution across a large technical scope.

🎯 Requirements

• 15-18+ years overall in technical roles with a focus on operations and automation for cloud infrastructure, platforms, and applications.
• 5-10+ years of lead experience.
• BS/MS or higher, or equivalent experience, in systems / software engineering or related engineering fields.
• Technical proficiency in multi-tenant data center and cloud-native architectures, including bare metal, virtualization, containerization, and higher-level abstractions (IaaS, Kubernetes, Slurm), as well as AI/ML platforms and applications.
• Demonstrated success delivering high-impact, technically complex solutions that achieve high levels of transparency into resource utilization, performance, and operational insights.
• Technical Leadership: Ability to synthesize multi-functional needs into architecture and design while guiding internal execution across complementary teams.
• Communication and Partnership: Strong collaboration and influence skills; capable of leading engineering engagements, presenting with peers and partners, and working with high-performance accelerated computing customers.

🏖️ Benefits

• equity
• benefits
