Senior ML Platform Engineer

10,000+ employees

Founded 1993

🤖 Artificial Intelligence

🎮 Gaming

Artificial Intelligence • Gaming • Automotive

NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence (AI). NVIDIA drives advances in graphics processing units (GPUs), cloud computing, data centers, and virtual reality, with a focus on industries such as gaming, automotive, healthcare, and robotics. Company innovations such as NVIDIA Omniverse transform traditional digital processes by enabling highly realistic simulation and rendering. Applications span numerous industries, from autonomous vehicles with NVIDIA DRIVE and healthcare solutions with NVIDIA Clara to AI-powered analytics and workflows.

Senior ML Platform Engineer

🕒 9 days ago

🏄 California, Colorado, +2 more states – Remote

💵 $152,000 - $241,500 / year

⏰ Full-time

🟠 Senior

🏗️ Platform Engineer

🦅 H1B visa sponsor

🗣️🇺🇸🇬🇧 English required

Ansible

Cloud

Docker

Kubernetes

Linux

Python

Terraform

NVIDIA

Description

• Design, build, and maintain our core ML platform infrastructure as code, primarily using Ansible and Terraform, ensuring reproducibility and scalability across large-scale, distributed GPU clusters.
• Apply SRE principles to diagnose, troubleshoot, and resolve complex system issues across the entire stack, ensuring high availability and performance for critical AI workloads.
• Develop robust internal automation and tooling for ML workflow orchestration, resource scheduling, and platform operations, with a strong focus on software engineering best practices.
• Collaborate with ML researchers and applied scientists to understand infrastructure needs and build solutions that streamline their end-to-end experimentation.
• Evolve and operate our multi-cloud and hybrid (on-prem + cloud) environments, implementing monitoring, alerting, and incident response protocols.
• Participate in on-call rotation to provide support for platform services and infrastructure running critical ML jobs, driving root cause analysis and implementing preventative measures.
• Write high-quality, maintainable code (Python, Go) to contribute to the core orchestration platform and automate manual processes.
• Drive the adoption of modern GPU technologies and ensure smooth integration of next-generation hardware into ML pipelines (e.g., GB200, NVLink, etc.).

🎯 Requirements

• BS/MS in Computer Science, Engineering, or equivalent experience.
• 5+ years in software/platform engineering or SRE roles, including 3+ years focused on ML infrastructure or distributed compute systems.
• Strong proficiency in Infrastructure-as-Code (IaC) tools, specifically Ansible and Terraform, with a proven track record of building and managing production infrastructure.
• SRE-oriented mindset with extensive experience in diagnosing system-level issues, performance tuning, and ensuring platform reliability.
• Solid understanding of ML workflows and lifecycle, from data preprocessing to deployment.
• Proficiency in operating containerized workloads with Kubernetes and Docker.
• Strong software engineering skills in languages such as Python or Go, with a focus on automation, tooling, and writing production-grade code.
• Experience with Linux systems internals, networking, and performance tuning at scale.

🏖️ Benefits

• equity
• benefits

Similar jobs

Lead Data Platform Engineer

🕒 9 days ago

CentraState Healthcare System

1001 - 5000

Lead Data Platform Engineer handling the technical architecture for the Enterprise Data Analytics Platform team. Driving large-scale engineering initiatives across the organization while mentoring engineers.

🇺🇸 United States – Remote

⏰ Full-time

🟠 Senior

🏗️ Platform Engineer

🗣️🇺🇸🇬🇧 English required

Amazon Redshift

Apache

BigQuery

Cloud

Distributed Systems

Java

Kafka

Python

Scala

Spark

SQL

Senior Platform Engineer

🕒 9 days ago

Bridgeway Benefit Technologies

201 - 500

☁️ SaaS

👥 HR Tech

Senior Platform Engineer focused on architecting and maintaining Bridgeway's cloud infrastructure. Driving DevOps practices and delivering efficient platform solutions across teams.

🇺🇸 United States – Remote

⏰ Full-time

🟠 Senior

🏗️ Platform Engineer

🗣️🇺🇸🇬🇧 English required

AWS

Azure

Cloud

Docker

Firewalls

Python

SDLC

Terraform

Senior Platform Engineer

🕒 9 days ago

NeoBIM GmbH

1 - 10

🤖 Artificial Intelligence

🏠 Real Estate

Senior Platform Engineer at neoBIM transforming the construction industry with AI-powered BIM solutions. Focused on infrastructure, system reliability, and CI/CD workflows in a collaborative environment.

🇺🇸 United States – Remote

⏰ Full-time

🟠 Senior

🏗️ Platform Engineer

🗣️🇺🇸🇬🇧 English required

AWS

Azure

Cloud

DynamoDB

Google Cloud Platform

Grafana

Linux

MongoDB

MySQL

Postgres

Prometheus

Shell Scripting

Terraform

Senior Systems & Platform Engineer, Enterprise Apps

🕒 10 days ago

MANSCAPED

201 - 500

💄 Beauty

👥 B2C

🛍️ eCommerce

Senior Systems & Platform Engineer at MANSCAPED shaping Azure-based platform architecture and enterprise application integrations. Collaborating on cloud strategy and driving critical engineering initiatives.

🇺🇸 United States – Remote

💵 $167,000 - $177,000 / year

⏰ Full-time

🟠 Senior

🏗️ Platform Engineer

🗣️🇺🇸🇬🇧 English required

Azure

Cloud

Python

Terraform

TypeScript

Senior Platform Engineer

🕒 10 days ago

Strivacity

11 - 50

🔌 API

🔒 Cybersecurity

💳 Fintech

Platform Engineer building and maintaining infrastructure for engineering teams at Strivacity. Focusing on Kubernetes, automation, and operational excellence in a remote role.

🇺🇸 United States – Remote

⏰ Full-time

🟠 Senior

🏗️ Platform Engineer

🗣️🇺🇸🇬🇧 English required

AWS

Flux

Grafana

Kubernetes

Prometheus

Python

Terraform