Senior HPC Performance Engineer

August 13


NVIDIA

Artificial Intelligence • Gaming • Automotive

NVIDIA is a technology company specializing in accelerated computing and artificial intelligence. It develops graphics processing units (GPUs) and platforms for cloud computing, data centers, and virtual reality, serving the gaming, automotive, healthcare, and robotics industries. Products such as NVIDIA Omniverse enable high-fidelity simulation and rendering. Its applications range from autonomous vehicles built on NVIDIA DRIVE to healthcare solutions built on NVIDIA Clara, as well as AI-driven analytics and workflows.

📋 Description

• Conduct in-depth performance characterization and analysis on large multi-GPU and multi-node clusters.
• Study how our libraries interact with every hardware component (GPU, CPU, networking) and software component in the stack.
• Evaluate proofs of concept and conduct trade-off analyses when multiple solutions are available.
• Triage and root-cause performance issues reported by our customers.
• Collect large volumes of performance data, and build tools and infrastructure to visualize and analyze it.
• Collaborate with a highly dynamic team across multiple time zones.

🎯 Requirements

• M.S. (or equivalent experience) or Ph.D. in Computer Science or a related field, with relevant performance engineering and HPC experience.
• 3+ years of experience with parallel programming and at least one communication runtime (MPI, NCCL, UCX, NVSHMEM).
• Experience conducting performance benchmarking and triage on large-scale HPC clusters.
• Good understanding of computer system architecture, hardware-software interactions, and operating system principles (i.e., systems software fundamentals).
• Ability to implement micro-benchmarks in C/C++, and to read and modify the code base when required.
• Ability to debug performance issues across the entire hardware/software stack.
• Proficiency in a scripting language, preferably Python.
• Familiarity with containers, cloud provisioning, and scheduling tools (Kubernetes, SLURM, Ansible, Docker).
• Adaptability and a passion for learning new areas and tools.
• Flexibility to work and communicate effectively across different teams and time zones.

🏖️ Benefits

• Highly competitive salaries
• Extensive benefits package
• A work environment that promotes diversity, inclusion, and flexibility

