Senior Software Engineer – NVLink Rack Scale Stability and Reliability

🕒 May 22

🗣️🇺🇸🇬🇧 English required

Distributed Systems

Python

Shell Scripting

Switching

TCP/IP

NVIDIA

10,000+ employees

Founded in 1993

🤖 Artificial Intelligence

🎮 Gaming

Artificial Intelligence • Gaming • Automotive

NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence. The company pioneers advances in graphics processing units (GPUs), cloud computing, data centers, and virtual reality, with a focus on the gaming, automotive, healthcare, and robotics sectors. Its innovations, such as NVIDIA Omniverse, transform traditional digital processes by enabling high-fidelity simulation and rendering tasks. Its applications span many industries, from autonomous vehicles with NVIDIA DRIVE to healthcare solutions with NVIDIA Clara, as well as AI-driven analytics and workflows.

Description

• Drive platform bringup, feature enablement, end-to-end software validation, and debug for next-generation NVLink-based GPU and rack-scale systems.
• Develop tools, diagnostics, automation, and infrastructure for system validation, regression testing, and fleet support.
• Lead reliability and MTBI validation through stress testing, telemetry analysis, failure injection, and issue resolution.
• Triage complex software, firmware, networking, and platform issues across validation, deployment, and production environments.
• Collaborate with architecture, hardware, firmware, software, and customer engagement teams to improve system quality and reliability.
• Build and maintain SRE-style validation infrastructure, including provisioning, monitoring, and operational readiness.
• Create automation, dashboards, runbooks, and debug workflows that improve root-cause analysis and operational efficiency.

🎯 Requirements

• BS or MS in Computer Science, Computer Engineering, Electrical Engineering, or related field, or equivalent experience.
• 5+ years of experience in system software, firmware, networking, platform enablement, data center infrastructure, or distributed systems.
• Strong programming skills in C/C++ and Python; Bash/Shell scripting experience is a plus.
• Strong system-level debugging across software, firmware, hardware, and networking layers.
• Solid networking fundamentals, including TCP/IP, Ethernet and/or InfiniBand, RDMA/RoCE, routing, switching, and fabric performance analysis.
• Experience with large-scale AI systems, including platform bringup, validation, reliability engineering, stress testing, telemetry analysis, and root-cause debugging.
• Ability to triage complex multi-domain issues using logs, telemetry, experiments, and structured debugging methods.
• Strong communication and collaboration skills across engineering, customer, and operations teams.
• Passion for building reliable next-generation AI infrastructure and solving complex system-level challenges at scale.

🏖️ Benefits

• Eligible for equity and benefits
