NVIDIA

Artificial Intelligence • Gaming • Automotive

NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence. It pioneers advances in graphics processing units (GPUs), cloud computing, data centers, and virtual reality, with a focus on the gaming, automotive, healthcare, and robotics industries. The company's innovations, such as NVIDIA Omniverse, transform traditional digital processes by enabling high-fidelity simulation and rendering. Its applications span many industries, from autonomous vehicles using NVIDIA DRIVE to healthcare solutions with NVIDIA Clara and AI-driven analytics and workflows.

10,000+ employees

Founded 1993

🤖 Artificial Intelligence

🎮 Gaming

Senior Deep Learning Software Engineer, Inference

September 17

🇳🇱 Netherlands – Remote

⏰ Full Time

🟠 Senior

🧑‍💻 Full-stack Engineer

Python

PyTorch

📋 Description

• Optimize, analyze, and tune the performance of DL models in domains such as LLM, multimodal, and generative AI
• Scale performance of DL models across different architectures and types of NVIDIA accelerators
• Contribute features and code to NVIDIA’s inference libraries, vLLM, SGLang, FlashInfer, and LLM software solutions
• Collaborate with teams across frameworks, NVIDIA libraries, and inference optimization solutions
• Implement and optimize model serving pipelines using open-source tools and plugins including CUTLASS, OAI Triton, NCCL, and CUDA kernels

🎯 Requirements

• Master's or PhD, or equivalent experience, in a relevant field (Computer Engineering, Computer Science, EECS, AI)
• 5+ years of relevant software development experience
• Excellent C/C++ programming and software design skills
• Agile software development skills are helpful, and Python experience is a plus
• Prior experience training, deploying, or optimizing the inference of DL models in production is a plus
• Prior background in performance modeling, profiling, debugging, and code optimization, or architectural knowledge of CPUs and GPUs, is a plus
• Experience with multi-GPU communications (NCCL, NVSHMEM) is a plus
• Experience building and shipping products to enterprise customers is a plus
• GPU programming experience (CUDA, OAI Triton, or CUTLASS)