MLOps Platform Engineer

🕒 4 months ago

🇺🇸 United States – Remote

💵 $185,000 - $200,000 / year

⏰ Full-time

🟠 Senior

🔴 Expert

🏗️ Platform Engineer

🦅 H1B Visa Sponsor

🗣️🇺🇸🇬🇧 English required

dv01

51 - 200 employees

Founded 2014

💸 Finance

💳 Fintech

☁️ SaaS

dv01 is a data management and analytics platform that serves as a critical link between lenders and capital markets. It specializes in providing standardized loan-level data and integrated analytics tools that ease access to structured finance products and enable their analysis. The platform supports several asset classes, including unsecured consumer loans, mortgages, auto loans, and student loans, and offers insights through features such as ESG data enrichment and portfolio monitoring. dv01's offerings help turn raw, error-prone data into trustworthy, actionable insights that help investment banks, hedge funds, asset managers, and institutional investors make smarter, data-driven financial decisions. The company challenges the outdated technology in the structured products market with a modern, cloud-based solution for improving data integrity and optimizing financial strategies.

Description

• Build and operate an AI infrastructure platform: You will design, build, and operate cloud-native infrastructure and platform tooling that accelerates AI development across the company. This includes enabling teams to develop, deploy, and operate AI-powered services safely and efficiently in production environments.
• Own the DevOps and infrastructure side of MLOps and Agentic Systems: You will focus on the operational foundations of AI systems, including CI/CD for AI workloads, scalable inference infrastructure, observability, cost management, and reliability. You will establish repeatable patterns and shared services that reduce friction for teams building AI-enabled applications.
• Enable AI services, agents, and runtime platforms: You will build and maintain infrastructure to support AI services such as LLM-backed APIs, Model Context Protocol (MCP) servers, and agentic systems used by production applications. You will enable secure tool access, runtime orchestration, and isolation boundaries for AI-driven workloads.
• Integrate MLOps capabilities into platform operations: You will apply MLOps concepts to improve platform operations, including using AI-driven approaches for monitoring, alerting, anomaly detection, and incident response across AI and non-AI systems. You will help evolve how the platform observes and operates complex AI-enabled systems at scale.
• Establish governance, security, and operational guardrails: You will help define and implement infrastructure-level governance for AI systems, including access controls, deployment policies, auditability, and secure-by-default patterns. You will partner with security and compliance teams to ensure AI infrastructure aligns with organizational risk and regulatory requirements.
• Provide technical leadership and enablement: You will act as a technical leader, influencing platform architecture and best practices across teams. You will mentor engineers and work closely with product, data, and application teams to align AI platform capabilities with business goals.

🎯 Requirements

• A senior cloud and platform engineer: You have 8+ years of experience in cloud infrastructure, DevOps, or platform engineering roles, with deep expertise designing and operating distributed systems in production.
• Experienced with MLOps and agentic platforms: You have direct exposure to ML/GenAIOps practices, such as monitoring, anomaly detection, predictive alerting, or automated remediation, applied to real production systems. 5+ years of MLOps experience is required.
• Strong in cloud-native infrastructure: You are proficient in building and managing cloud environments, Kubernetes, containerized workloads, and infrastructure-as-code tools such as Terraform.
• Comfortable supporting AI workloads: You have hands-on experience supporting platforms that host and run deep neural networks, including LLM runtimes (e.g., vLLM, llama.cpp), ML compiler stacks (e.g., LLVM/MLIR), and PyTorch-based production systems.
• Security- and operations-minded: You have a strong understanding of infrastructure security, IAM, secrets management, and operational risk as it relates to AI-enabled systems.
• A platform-focused technical leader: You operate effectively as a technical leader, influencing architecture and standards while remaining hands-on. You communicate clearly, collaborate well cross-functionally, and thrive in ambiguous problem spaces.
• Forward-thinking and pragmatic: You are proactive and innovative, with the ability to introduce emerging agentic patterns while balancing operational maturity and long-term maintainability. You will help design and operate scalable benchmarking and evaluation frameworks for agentic AI systems, enabling quantitative measurement of accuracy, reliability, cost–performance tradeoffs, regression detection, and the impact of model, prompt, or architecture changes (including techniques such as LLM-as-a-judge), with tooling that is reusable and accessible across the organization.
• Nice to have: experience with Pulumi; experience with GCP and Cloudflare; experience with GHA and Harness; experience with Go; experience supporting data engineering platforms; exposure to data warehousing and ETL/ELT tools or operations.

🏖️ Benefits

• Unlimited PTO. Unplug and rejuvenate however you want, whether that’s vacationing on the beach or staying home for a mental-health day.
• $1,000 Learning & Development Fund. No matter where you are in your career, always invest in your future. We encourage you to attend conferences, take classes, and lead workshops. We also host hackathons, brunch & learns, and other employee-led learning opportunities.
• Remote-First Environment. People thrive in a flexible and supportive environment that best invigorates them. You can work from your home, a cafe, or a hotel. You decide.
• Health Care and Financial Planning. We offer a comprehensive medical, dental, and vision insurance package for you and your family. We also offer a 401(k) that you can contribute to.
• Stay active your way! Get $138/month to put toward your favorite gym or fitness membership, wherever you like to work out. Prefer to exercise at home? You can also use up to $1,650 per year through our Fitness Fund to purchase workout equipment, gear, or other wellness essentials.
• New Family Bonding. Primary caregivers can take 16 weeks of 100% paid leave, while secondary caregivers can take 4 weeks. Returning to work after bringing home a new child isn’t easy, which is why we’re flexible and empathetic to the needs of new parents.

Similar Jobs

🕒 4 months ago

Factorial

501 - 1000

👥 HR Tech

☁️ SaaS

🏢 Enterprise

Databricks AI Platform Engineer focusing on software engineering and ML deployment with a strong influence on AI project delivery. Located in a remote-first company with emphasis on innovation and collaboration.

🗣️🇺🇸🇬🇧 English required

🗣️🇪🇸 Spanish required

🕒 4 months ago

Albert Invent

51 - 200

🤖 Artificial Intelligence

🧬 Biotechnology

🔬 Science

AI/ML Platform Engineer responsible for building APIs and data pipelines for AI products. Collaborating with scientists to optimize workflows and enhance experimentation capabilities.

🇺🇸 United States – Remote

💰 Seed Round in 2023-06

⏰ Full-time

🔴 Expert

🏗️ Platform Engineer

🗣️🇺🇸🇬🇧 English required

🕒 4 months ago

Fanatics, Inc.

1001 - 5000

🎮 Gaming

🛒 Retail

🛍️ eCommerce

Senior Platform Engineer working on cloud infrastructure, Kubernetes platforms, and internal developer tooling for Fanatics Betting & Gaming. Collaborating with various teams to enhance developer productivity.

🇺🇸 United States – Remote

💵 $129,200 - $183,600 / year

⏰ Full-time

🟠 Senior

🏗️ Platform Engineer

🗣️🇺🇸🇬🇧 English required

🕒 4 months ago

TigerData (creators of TimescaleDB)

51 - 200

☁️ SaaS

🤖 Artificial Intelligence

Senior Platform Engineer specializing in PostgreSQL at Tiger Data. Designing database features and ensuring operational excellence for cloud database platforms.

🇺🇸 United States – Remote

⏰ Full-time

🟠 Senior

🏗️ Platform Engineer

🗣️🇺🇸🇬🇧 English required

🕒 4 months ago

PointClickCare

1001 - 5000

⚕️ Health Insurance

☁️ SaaS

🏢 Enterprise

Principal AI Platform Engineer focusing on building AI infrastructure for PointClickCare's GenAI capabilities. Collaborating across engineering teams to support generative AI solutions and insights.

🇺🇸 United States – Remote

💵 $179,000 - $199,000 / year

💰 Secondary Market in 2022-03

⏰ Full-time

🔴 Expert

🏗️ Platform Engineer

🦅 H1B Visa Sponsor

🗣️🇺🇸🇬🇧 English required