Senior Solutions Architect, HPC – AI

November 4

NVIDIA

Artificial Intelligence • Gaming • Automotive

NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence. NVIDIA pioneers advancements in graphics processing units (GPUs), cloud computing, data centers, and virtual reality, with a focus on the gaming, automotive, healthcare, and robotics industries. The company's innovations, such as NVIDIA Omniverse, transform traditional digital processes by enabling high-fidelity simulations and rendering tasks. Their applications span various industries, from autonomous vehicles using NVIDIA DRIVE to healthcare solutions with NVIDIA Clara, as well as AI-driven analytics and workflows.

10,000+ employees

Founded 1993

🤖 Artificial Intelligence

🎮 Gaming

📋 Description

• Collaborating with NVIDIA’s training framework developers and product teams to stay ahead of the latest features and helping partners adopt them effectively.
• Assisting with deployment, debugging, and improving the efficiency of AI workloads on large-scale NVIDIA platforms.
• Benchmarking new framework features, analyzing performance, and sharing actionable insights with both customers and internal teams.
• Working directly with external customers to solve cluster performance and stability issues, identify bottlenecks, and implement effective solutions.
• Building expertise and guiding customers in scaling workloads efficiently and reliably on the latest generation of NVIDIA GPUs.
• Contributing to Europe’s Sovereign AI initiative by helping customers implement advanced resiliency features within AI training pipelines.

🎯 Requirements

• 8+ years of experience in accelerated computing technologies at cluster scale
• Strong programming skills in at least one of the following languages: C, C++, or Python
• Practical experience identifying and resolving bottlenecks in large-scale training workloads or parallel applications
• Hands-on experience in profiling and debugging large parallel applications
• Solid understanding of CPU and GPU architectures, CUDA, parallel filesystems, and high-speed interconnects
• Experience working with large compute clusters, with an understanding of their internal scheduling and resource management mechanisms (e.g. SLURM or cloud-based clusters)
• Thorough knowledge of training pipelines and frameworks, including their internal operations and performance characteristics

🏖️ Benefits

• Health insurance
• Flexible work arrangements
• Professional development
