Deep Learning Compiler – CI/Infrastructure Engineer

🔥 1 minute ago

NVIDIA

10,000+ employees

Founded 1993

Artificial Intelligence • Gaming • Automotive

NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence. NVIDIA pioneers advancements in graphics processing units (GPUs), cloud computing, data centers, and virtual reality, with a focus on the gaming, automotive, healthcare, and robotics industries. The company's innovations, such as NVIDIA Omniverse, transform traditional digital processes by enabling high-fidelity simulation and rendering. Its applications span many industries, from autonomous vehicles using NVIDIA DRIVE to healthcare solutions with NVIDIA Clara, as well as AI-driven analytics and workflows.

📋 Description

• Build, maintain, and improve CI infrastructure that supports development, verification, and release of NVIDIA’s deep learning compiler stacks across GPU and accelerator environments
• Improve CI reliability and signal quality by reducing flakes, improving reproducibility, strengthening diagnostics, and making correctness and performance failures easier to understand and act on
• Apply automation, AI, and agent-based workflows to reduce manual CI operations, speed up failure triage, and improve developer efficiency
• Build reusable and self-service CI platforms that support multiple products, projects, model suites, hardware targets, and software configurations while partnering closely with compiler, infrastructure, and release teams

🎯 Requirements

• BS, MS, or PhD (or equivalent experience) in Computer Science, Computer/Electrical Engineering, Mathematics, or a related field
• 5+ years of experience designing, scaling, and operating CI/CD, build/release, or developer infrastructure for complex software systems
• Proven experience building CI platforms end-to-end using systems such as GitLab CI, Jenkins, or similar tools, including pipeline orchestration, compute/runner management, artifact and package systems, and observability, with strong emphasis on reliability, reproducibility, and debuggability
• Strong software engineering skills (Python required), with the ability to design, implement, and debug distributed systems end-to-end
• Familiarity with edge devices (SoCs, e.g. NVIDIA Tegra) in a host-target architecture, including the ability to debug them and knowledge of their automation nuances
• Proven track record of designing, building, and deploying AI/LLM-based systems in real engineering workflows, demonstrating skill in evaluating trade-offs, failure modes, maintainability, and measurable impact on developer productivity, signal quality, or operational efficiency

🏖️ Benefits

• Competitive salaries and a generous benefits package
