Senior Software Engineer – NVLink Rack Scale Stability and Reliability

🕒 18 days ago

🗣️🇺🇸🇬🇧 English required

Distributed Systems

Python

Shell Scripting

Switching

TCP/IP

NVIDIA

10,000+ employees

Founded 1993

🤖 Artificial Intelligence

🎮 Gaming

Artificial Intelligence • Gaming • Automotive

NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence (AI). It drives advances in graphics processing units (GPUs), cloud computing, data centers, and virtual reality, with a focus on industries such as gaming, automotive, healthcare, and robotics. Innovations such as NVIDIA Omniverse transform traditional digital processes by enabling highly realistic simulation and rendering. Its applications span many industries, from autonomous vehicles with NVIDIA DRIVE and healthcare solutions with NVIDIA Clara to AI-powered analytics and workflows.

Description

• Drive platform bringup, feature enablement, end-to-end software validation, and debug for next-generation NVLink-based GPU and rack-scale systems.
• Develop tools, diagnostics, automation, and infrastructure for system validation, regression testing, and fleet support.
• Lead reliability and MTBI validation through stress testing, telemetry analysis, failure injection, and issue resolution.
• Triage complex software, firmware, networking, and platform issues across validation, deployment, and production environments.
• Collaborate with architecture, hardware, firmware, software, and customer engagement teams to improve system quality and reliability.
• Build and maintain SRE-style validation infrastructure, including provisioning, monitoring, and operational readiness.
• Create automation, dashboards, runbooks, and debug workflows that improve root-cause analysis and operational efficiency.

🎯 Requirements

• BS or MS in Computer Science, Computer Engineering, Electrical Engineering, or related field, or equivalent experience.
• 5+ years of experience in system software, firmware, networking, platform enablement, data center infrastructure, or distributed systems.
• Strong programming skills in C/C++ and Python; Bash/Shell scripting experience is a plus.
• Strong system-level debugging across software, firmware, hardware, and networking layers.
• Solid networking fundamentals, including TCP/IP, Ethernet and/or InfiniBand, RDMA/RoCE, routing, switching, and fabric performance analysis.
• Experience with large-scale AI systems, including platform bringup, validation, reliability engineering, stress testing, telemetry analysis, and root-cause debugging.
• Ability to triage complex multi-domain issues using logs, telemetry, experiments, and structured debugging methods.
• Strong communication and collaboration skills across engineering, customer, and operations teams.
• Passion for building reliable next-generation AI infrastructure and solving complex system-level challenges at scale.

🏖️ Benefits

• Eligible for equity and benefits

