Network DevOps Engineer, RDMA Fabric Automation

Job not listed on LinkedIn

🕒 3 months ago

🇺🇸 United States – Remote

💵 $90,000 - $130,000 / year

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & SRE Engineer

🗣️🇺🇸🇬🇧 English required


Vultr

201 - 500 employees

Founded in 2014

🤖 Artificial Intelligence

🤝 B2B

🔧 Hardware

🔥 Funding within the last year

💰 €329,000,000 Debt Financing - Vultr in 2025-06


Vultr is a global cloud infrastructure provider offering on-demand virtual machines, bare-metal servers, GPU-accelerated instances, managed databases, object and block storage, Kubernetes services, and networking. The platform emphasizes AI and high-performance computing (HPC) workloads, with a wide selection of AMD and NVIDIA GPUs, fast networking, and more than 32 data center regions, along with a marketplace of deployable applications and developer-friendly APIs. Vultr targets developers and businesses looking for affordable, scalable, and compliant cloud alternatives to the hyperscalers for compute and storage.

Description

• Automate deployment and operations of large-scale RDMA (RoCEv2) Ethernet fabrics across Vultr data centers.
• Build Ansible and Python-based frameworks to provision, validate, and remediate underlay and overlay networks.
• Integrate network automation with Vultr’s source-of-truth systems (NetBox, OpsMill) for intent-driven configuration and validation.
• Develop telemetry ingestion and correlation pipelines (gNMI, Prometheus, Kafka, custom collectors) for real-time network health and performance metrics.
• Collaborate with platform, orchestration, and product engineering teams to optimize RDMA performance, PFC/ECN behavior, and path symmetry across fabrics.
• Implement CI/CD workflows for network configuration changes: validation, pre-checks, and rollbacks.
• Investigate complex network behaviors across layers: flow hashing, congestion domains, ECMP, and overlay interactions.
• Contribute to the design of next-generation GPU and AI interconnect fabrics, ensuring seamless integration into Vultr’s global network architecture.

🎯 Requirements

• Solid understanding of modern data center networking: EVPN-VXLAN, BGP, MLAG, QoS, and traffic engineering.
• Deep familiarity with RoCEv2, RDMA transport tuning, ECN/PFC, and lossless Ethernet design.
• Strong experience with automation frameworks like Ansible, and languages like Python, Golang, Rust, or PHP.
• Comfort working with telemetry and monitoring stacks: Prometheus, Grafana, Loki, ELK, or similar.
• Previous experience integrating with NetBox, Nautobot, OpsMill, or similar for topology and configuration source-of-truth.
• Familiarity with CI/CD systems (GitHub Actions, Jenkins, ArgoCD) for continuous delivery of network automation.
• Strong Linux networking background, including namespaces, netlink, and system-level debugging.

🏖️ Benefits

• 100% company-paid insurance premiums for employee medical, dental, and vision plans.
• 401(k) plan that matches 100% up to 4%, with immediate vesting.
• Professional Development Reimbursement of $2,500 each year.
• 11 Holidays + Paid Time Off Accrual + Rollover Plan.
• Increased PTO at 3-year and 10-year anniversaries + 1 month paid sabbatical every 5 years + Anniversary Bonus each year.
• $500 stipend for remote office setup in first year + $400 each following year.
• Internet reimbursement up to $75 per month.
• Gym membership reimbursement up to $50 per month.
• Company-paid Wellable subscription.

Similar Jobs

🕒 3 months ago

Tactibit Technologies

11 - 50

🔒 Cybersecurity

🏛️ Government

DevOps Engineer working at Tactibit Technologies to modernize legacy architectures for mission-critical systems. Collaborate with teams on cloud migrations and automating business processes.

🇺🇸 United States – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & SRE Engineer

🗣️🇺🇸🇬🇧 English required

🕒 3 months ago

DroneUp

51 - 200

🚀 Aerospace

☁️ SaaS

🤝 B2B

SRE - Platform Engineer at DroneUp focusing on IT infrastructure reliability and scalability. Driving SRE best practices within the team and collaborating on cloud engineering solutions.

🇺🇸 United States – Remote

💵 $125,000 - $150,000 / year

💰 €241,201 Seed Round - DroneUp in 2022-07

⏰ Full Time

🟠 Senior

🔴 Expert

⛑ DevOps & SRE Engineer

🦅 H1B Visa Sponsor

🗣️🇺🇸🇬🇧 English required

🕒 3 months ago

Filevine

201 - 500

☁️ SaaS

🤖 Artificial Intelligence

Senior DBRE managing performance and scalability of data platform at Filevine, a legal AI company. Focus on AI-driven automation, optimizing SQL Server and Postgres environments.

🇺🇸 United States – Remote

💵 $145,000 - $180,000 / year

💰 €108,000,000 Series D in 2022-04

⏰ Full Time

🟠 Senior

⛑ DevOps & SRE Engineer

🦅 H1B Visa Sponsor

🗣️🇺🇸🇬🇧 English required

🕒 4 months ago

Agile Defense

501 - 1000

🏛️ Government

🔒 Cybersecurity

DevSecOps Engineer building secure software delivery systems for national security missions. Seeking a builder with 3–5 years of relevant experience and a proactive approach to integration challenges.

🇺🇸 United States – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & SRE Engineer

🗣️🇺🇸🇬🇧 English required

🕒 4 months ago

Supabase

51 - 200

☁️ SaaS

🔌 API

🤖 Artificial Intelligence

Customer Reliability Engineer at Supabase working to ensure customer satisfaction and platform reliability. Collaborating with various teams to enhance the customer experience and resolve issues.

🇺🇸 United States – Remote

💰 €80,000,000 Series B in 2022-05

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & SRE Engineer

🗣️🇺🇸🇬🇧 English required