Senior Solutions Architect, InfiniBand, Networking, Ethernet

NVIDIA

10,000+ employees

Founded 1993

Artificial Intelligence • Gaming • Automotive

NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence. NVIDIA pioneers advancements in graphics processing units (GPUs), cloud computing, data centers, and virtual reality, with a focus on the gaming, automotive, healthcare, and robotics industries. The company's innovations, such as NVIDIA Omniverse, transform traditional digital processes by enabling high-fidelity simulations and rendering tasks. Their applications span various industries, from autonomous vehicles using NVIDIA DRIVE and healthcare solutions with NVIDIA Clara to AI-driven analytics and workflows.

📋 Description

• Primary responsibilities will include building AI/HPC infrastructure for new and existing customers.
• Support operational and reliability aspects of large-scale AI clusters, focusing on performance at scale, real-time monitoring, logging, and alerting.
• Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
• Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
• Provide feedback to internal teams by opening bugs, documenting workarounds, and suggesting improvements.

🎯 Requirements

• BS/MS/PhD or equivalent experience in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, or related fields.
• At least 8 years of professional experience in networking fundamentals, the TCP/IP stack, and data center architecture.
• Proficiency in configuring, testing, validating, and resolving issues in LAN and InfiniBand networks, especially in medium to large-scale HPC/AI environments.
• Advanced knowledge of EVPN, BGP, OSPF, and VXLAN.
• Hands-on experience with network switch/router platforms like Cumulus Linux, SONiC, IOS, Junos OS, and EOS.
• Extensive experience delivering automated network provisioning solutions using tools like Ansible, Salt, and Python.
• Ability to develop CI/CD pipelines for network operations.
• Strong focus on customer needs and satisfaction.
• Self-motivated, with leadership skills to work collaboratively with customers and internal teams.
• Strong written, verbal, and listening skills in English are essential.
