MLOps, ML Platform Engineer

🕒 February 26

SumerSports

11 - 50 employees

Founded 2022

⚽ Sports

🤖 Artificial Intelligence

☁️ SaaS

SumerSports is an AI-powered sports analytics and technology company focused on football (NFL and NCAA). Combining more than 500 years of collective NFL experience with machine learning, SumerSports offers products such as SūmerBrain for film retrieval and multi-layered data, SūmerLive for game tracking, SūmerNFL and SūmerNCAA for roster building and team optimization, and a platform for player-verified metrics and talent exposure. The company also produces draft guides, analytics-driven content featuring former scouts and Hall of Famers, and tools that help players, teams, and fans improve scouting, roster decisions, and performance evaluation.

📋 Description

• Design and operate ML infrastructure: Manage data, training, serving, and inference systems for high-throughput model workflows
• Build scalable pipelines: Implement reproducible training and evaluation pipelines with versioning, scheduling, and artifact tracking
• Optimize compute and cost: Tune GPU and CPU workloads, manage clusters, and drive efficiency via rightsizing, spot scheduling, and caching
• Serve models in production: Operate APIs for low-latency inference with autoscaling, blue-green or canary rollouts, and rollback safety
• Ensure reliability and observability: Define and own SLOs; instrument pipelines and services to track latency, cost, drift, and data quality
• Secure and automate: Manage IAM, secrets, and container security; automate deployment pipelines via CI/CD and infrastructure as code
• Collaborate cross-functionally: Partner with research scientists and AI engineers to deliver models from experiment to production with minimal friction
• Document and enable: Build templates, runbooks, and internal tooling that make ML workflows repeatable, safe, and fast

🎯 Requirements

• 4+ years of experience in ML platform, DevOps, or infrastructure engineering
• Deep knowledge of Kubernetes, CI/CD, containers, and cloud infrastructure (AWS, GCP, or Azure)
• Hands-on experience managing GPU clusters and training/inference pipelines
• Familiarity with data orchestration and storage formats (Delta, Parquet, Polars, Spark)
• Proven ability to ship and operate production ML systems with SLOs
• Strong Python skills and comfort with infrastructure as code and automation
• Experience with observability and cost optimization at scale

🏖️ Benefits

• Competitive salary and bonus plan
• Comprehensive health insurance plan
• Retirement savings plan (401k) with company match
• Remote working environment
• Flexible, unlimited time-off policy
• Generous paid holiday schedule: 13 days in total, including the Monday after the Super Bowl
