Principal MLOps Engineer

🕒 April 20

Raft

51 - 200 employees

🤖 Artificial Intelligence

🏛️ Government

☁️ SaaS

Raft is a company that partners with public agencies to solve complex problems affecting the lives of millions of Americans. Specializing in cutting-edge digital solutions, Raft focuses on data and AI, digital platforms at scale, and complex software applications. The company emphasizes software architecture, UX/UI design, and automated testing to modernize legacy applications and data systems for speed, security, and scalability. Raft also implements sustainable data governance strategies and human-centered AI systems to enhance decision-making. As a government and commercial partner in advancing technology solutions, Raft is dedicated to empowering organizations with products that prioritize user outcomes over features.

📋 Description

• Design, build, and maintain secure, scalable MLOps infrastructure and deployment pipelines for production ML systems
• Help mature Raft’s internal ML platform and model lifecycle capabilities, including model packaging, registry/catalog workflows, deployment, monitoring, and operational support
• Deploy and manage machine learning workloads on Kubernetes, including GPU-enabled clusters
• Support model serving and inference infrastructure for a range of ML use cases, including traditional ML, computer vision, speech/audio, and LLM-based systems
• Build and maintain CI/CD workflows for ML services, model artifacts, and platform components
• Partner closely with ML engineers, software engineers, and product teams to move models from experimentation to reliable operational deployment
• Improve observability, reliability, security, and maintainability across ML infrastructure and services
• Help evaluate and standardize runtime patterns, serving frameworks, and deployment architectures for production ML workloads
• Contribute to infrastructure decisions across edge, on-prem, and cloud-hosted deployment environments
• Support compliance-driven deployment practices and secure software supply chain requirements in defense environments
• Get hands-on with customers at the most forward-leaning places in the Department of War

🎯 Requirements

• 7+ years of relevant hands-on experience in software engineering, platform engineering, DevOps, MLOps, or related technical roles
• 5+ years of experience with Docker and Kubernetes in production environments
• 5+ years of experience supporting enterprise cloud infrastructure or applications in AWS, Azure, or similar environments
• Strong experience provisioning, operating, and troubleshooting Kubernetes clusters in production
• Experience building and maintaining machine learning platforms, infrastructure, or pipelines used by engineering or data science teams
• Practical experience deploying machine learning workloads on Kubernetes
• Experience managing clusters or workloads that use GPUs
• Strong understanding of Helm and Kubernetes deployment patterns
• Strong scripting or programming skills, preferably in Python
• Experience with modern software engineering practices including Git, CI/CD, DevOps, and Agile/Scrum workflows
• Strong troubleshooting, systems thinking, and communication skills
• Ability to work independently and collaboratively in a fast-moving environment
• Ability to obtain and maintain a Top Secret clearance
• Ability to obtain Security+ certification within the first 90 days of employment

🏖️ Benefits

• Highly competitive salary
• Fully covered healthcare, dental, and vision coverage
• 401(k) and company match
• Take-as-you-need PTO + 11 paid holidays
• Education & training benefits
• Annual budget for your tech/gadget needs
• Monthly box of yummy snacks to eat while doing meaningful work
• Remote, hybrid, and flexible work options
• Team off-site in fun places!
• Generous referral bonuses
• And more!
