Senior Network Reliability Engineer – DGX Cloud

🕒 May 14

NVIDIA

10,000+ employees

Founded 1993

Artificial Intelligence • Gaming • Automotive

NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence. NVIDIA pioneers advancements in graphics processing units (GPUs), cloud computing, data centers, and virtual reality, with a focus on the gaming, automotive, healthcare, and robotics industries. The company's innovations, such as NVIDIA Omniverse, transform traditional digital processes by enabling high-fidelity simulation and rendering. Its applications range from autonomous vehicles using NVIDIA DRIVE to healthcare solutions with NVIDIA Clara and AI-driven analytics and workflows.

📋 Description

• Engage in 24/7 global shift rotations to provide remote support for network repairs and changes while collaborating across teams and updating customers on status and ticket information.
• Drive operational improvements in change management and daily operations by following procedures.
• Manage and operate large-scale IP network technologies and infrastructures.
• Utilize your skills in peering and datacenter interconnect technologies: PNI, Transit, Exchange, Passive DWDM, Wave circuits.
• Monitor and support the network health of on-premises and cloud infrastructures.
• Collaborate and develop workflow enhancements while documenting best practices.

🎯 Requirements

• Deep knowledge of and experience with TCP/IP, BGP, OSPF, MPLS, IS-IS, VXLAN, EVPN, QoS, GRE, IPsec, DNS, and MACsec.
• 5+ years of experience in network operations.
• Skilled in network troubleshooting techniques, with demonstrated creative problem-solving abilities.
• Strong track record of alert response within defined SLAs and of incident management.
• Experience with one or more of the following CSP environments: AWS, Azure, GCP, OCI.
• Familiarity with Arista, Fortinet, and Juniper.
• Hands-on experience contributing to tooling and automation for provisioning, monitoring, and managing complex network infrastructures.
• Bachelor’s degree in Computer Science, related technical field, or equivalent experience.
• Excellent verbal and written communication skills.

🏖️ Benefits

• equity
• benefits
