Principal Software Engineer – Large-Scale LLM Memory and Storage Systems

🕒 December 22, 2025

🗣️🇺🇸🇬🇧 English required



NVIDIA

10,000+ employees

Founded in 1993

🤖 Artificial Intelligence

🎮 Gaming

Artificial Intelligence • Gaming • Automotive

NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence. The company pioneers advances in graphics processing units (GPUs), cloud computing, data centers, and virtual reality, with a focus on the gaming, automotive, healthcare, and robotics sectors. Its innovations, such as NVIDIA Omniverse, transform traditional digital processes by enabling high-fidelity simulation and rendering tasks. Its applications span many sectors, from autonomous vehicles with NVIDIA DRIVE to healthcare solutions with NVIDIA Clara, as well as AI-powered analytics and workflows.

Description

• Design and evolve a unified memory layer that spans GPU memory, pinned host memory, RDMA-accessible memory, SSD tiers, and remote file/object/cloud storage to support large-scale LLM inference
• Architect and implement deep integrations with leading LLM serving engines (such as vLLM, SGLang, TensorRT-LLM), with a focus on KV-cache offload, reuse, and remote sharing across heterogeneous and disaggregated clusters
• Co-design interfaces and protocols that enable disaggregated prefill, peer-to-peer KV-cache sharing, and multi-tier KV-cache storage (GPU, CPU, local disk, and remote memory) for high-throughput, low-latency inference
• Partner closely with GPU architecture, networking, and platform teams to exploit GPUDirect, RDMA, NVLink, and similar technologies for low-latency KV-cache access and sharing across heterogeneous accelerators and memory pools
• Mentor senior and junior engineers, set technical direction for memory and storage subsystems, and represent the team in internal reviews and external forums (open source, conferences, and customer-facing technical deep dives)

🎯 Requirements

• Master's degree, PhD, or equivalent experience
• 15+ years of experience building large-scale distributed systems, high-performance storage, or ML systems infrastructure in C/C++ and Python, with a track record of delivering production services
• Deep understanding of memory hierarchies (GPU HBM, host DRAM, SSD, and remote/object storage) and experience designing systems that span multiple tiers for performance and cost efficiency
• Experience with distributed caching or key-value systems, especially designs optimized for low latency and high concurrency
• Hands-on experience with networked I/O and RDMA/NVMe-oF/NVLink-style technologies, and familiarity with concepts like disaggregated and aggregated deployments for AI clusters
• Strong skills in profiling and optimizing systems across CPU, GPU, memory, and network, using metrics to drive architectural decisions and validate improvements in time to first token (TTFT) and throughput
• Excellent communication skills and prior experience leading cross-functional efforts with research, product, and customer teams

🏖️ Benefits

• Equity
• Benefits

