Principal Deep Learning Communication Architect

🕒 April 14

NVIDIA

10,000+ employees

Founded 1993

Artificial Intelligence • Gaming • Automotive

NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence. NVIDIA pioneers advancements in graphics processing units (GPUs), cloud computing, data centers, and virtual reality, with a focus on the gaming, automotive, healthcare, and robotics industries. The company's innovations, such as NVIDIA Omniverse, transform traditional digital processes by enabling high-fidelity simulation and rendering. Its applications span various industries, from autonomous vehicles using NVIDIA DRIVE to healthcare solutions with NVIDIA Clara and AI-driven analytics and workflows.

📋 Description

• Define the long-term technical roadmap for communication libraries across NVIDIA’s next-generation platforms
• Lead the development of next-generation communication primitives and collective algorithms
• Partner with application developers to architect and implement specialized communication primitives
• Collaborate with silicon architects and software engineers to influence hardware specifications for next-generation networking
• Develop high-fidelity analytical models and simulators to predict system behavior under emerging workloads

🎯 Requirements

• Ph.D. or M.S. in Computer Science, Electrical Engineering, or a related field (or equivalent experience)
• 12+ years of industry experience in high-performance computing (HPC) or distributed deep learning
• Deep understanding of 3D parallelism (Data, Tensor, Pipeline) and advanced strategies including Context Parallelism, Expert Parallelism, and Zero Redundancy Optimizer (ZeRO) variants
• Deep technical proficiency with NCCL, UCX, UCC, NVSHMEM, or MPI
• Experience with RDMA, RoCE, and low-level InfiniBand verbs
• Advanced knowledge of high-throughput inference engines and schedulers, specifically TensorRT-LLM, vLLM, SGLang, and NVIDIA Dynamo
• Expert knowledge of the NVIDIA GPU memory hierarchy (HBM3e/HBM4, L2 cache)

🏖️ Benefits

• Equity
• Benefits
