Tech Lead, ML Infrastructure

NBCUniversal

10,000+ employees

Founded 2004

Media • Entertainment

NBCUniversal is a global media and entertainment company that creates and distributes content across a variety of platforms. Part of Comcast, it draws on over 100 years of experience and encompasses brands such as Peacock and NBC Sports, which educate, entertain, and empower audiences around the world. The company works in television broadcasting, film production, and theme parks, and is also recognized for its initiatives in technology and corporate social responsibility. Its commitment to innovation and social impact makes it a vibrant workplace for media and tech professionals.

📋 Description

• Steward the end-to-end planning, execution, and delivery of the data and training infrastructure for the organization.
• Act as the primary point of contact for the team that provides infrastructure to multiple other machine learning teams.
• Own the cloud spend and implement cost tracking, resource allocation, and lifecycle management.
• Bring a service mindset to making the day-to-day work of ML engineers smooth, balancing engineering rigor with ease of use.

🎯 Requirements

• Undergraduate degree in Computer Science.
• Proven experience as a Tech Lead in an AI/ML infrastructure team.
• Prior experience in industries with complex multi-disciplinary teams, such as robotics, smart grids, precision agriculture, game development, or aerospace.
• Prior experience managing a team of around 7 direct reports, establishing ways of working, and developing the team to be high performing.
• High attention to detail and conscientiousness.
• Ability to translate customer requests into actionable technical requirements documents in collaboration with ML engineers.
• Fluency with the entire machine learning lifecycle, including storage orchestration, data provenance, distributed training orchestration, and deployment.
• Familiarity with Python, Git, and the Unix shell.
• Familiarity with collaborative tools such as Jira/Confluence, Slack, a Git server, a data platform, and an observability dashboard.

🏖️ Benefits

• Medical, dental and vision insurance
• 401(k)
• Paid leave
• Tuition reimbursement
• A variety of other discounts and perks

