MLOps Lead

Fundamental

51 - 200 employees

Founded 2024

🤖 Artificial Intelligence

🏢 Enterprise

☁️ SaaS

Fundamental is an enterprise AI company that builds large tabular models (LTMs) such as NEXUS, pre-trained on billions of tables to detect patterns and predict outcomes from structured data. The company offers an enterprise-grade predictive analytics platform that can be deployed with minimal code or integrated deeply with cloud partners like AWS, emphasizing privacy, security, and scalability. Born from academic research and backed by major investors, Fundamental targets large organizations seeking to extract foresight from their databases and deploy predictive models at cloud scale.

📋 Description

• Lead and mentor a team of MLOps engineers, fostering technical growth and a culture of operational excellence
• Define and drive the MLOps roadmap, aligning infrastructure capabilities with Research, Engineering and product objectives
• Establish best practices, standards, and processes for ML infrastructure, deployment, and operations
• Own technical decision-making for ML infrastructure architecture and tooling choices
• Architect and oversee scalable, automated machine learning pipelines, CI/CD workflows, and orchestration frameworks
• Drive the design and implementation of robust model serving infrastructure using platforms like Triton, TorchServe, TensorFlow Serving, and KServe
• Define inference architecture strategy optimized for ultra-low latency and high throughput
• Design and maintain feature stores, robust data pipelines, and scalable storage solutions to efficiently handle large volumes of data
• Collaborate with research teams to bridge the gap between experimentation and production
• Define logging, alerting, and monitoring strategy to track model performance, drift, and system reliability

🎯 Requirements

• Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent practical experience)
• 7+ years of experience in MLOps, with 3+ years in a technical leadership role
• Strong software engineering skills in Python, with experience in Bash and/or Go
• Proven track record of building and leading high-performing MLOps or infrastructure teams
• Experience building and designing MLOps infrastructure from the ground up
• Deep experience with MLOps platforms (MLflow, WandB, etc.) and frameworks (PyTorch, TensorFlow, etc.)
• Deep experience with model serving frameworks (Triton, TorchServe, TensorFlow Serving, KServe) for high scalability and low latency inference
• Experience building and managing data pipelines to support both model training and inference
• Good experience with Kubernetes on a major cloud provider (AWS, GCP, or Azure) and with infrastructure as code (Terraform, Helm, GitOps)
• Proficient with observability and monitoring tools (Prometheus, Grafana, Datadog, OpenTelemetry)
• Excellent communication skills with ability to translate between research and production contexts

🏖️ Benefits

• Competitive compensation with salary and equity
• Comprehensive health coverage for you and your dependents
• Paid parental leave for all new parents, inclusive of adoptive and surrogate journeys
• Relocation support for employees moving to join the team in one of our office locations
• A mission-driven, low-ego culture that values diversity of thought, ownership, and bias toward action
