Senior Machine Learning Ops Engineer

July 6

Overstory

Artificial Intelligence • Energy • SaaS

Overstory is a company that provides software solutions for utilities to optimize resources, mitigate vegetation risk, and manage infrastructure more intelligently, particularly to prevent power outages and wildfires. The company offers actionable vegetation intelligence by leveraging advanced technology such as artificial intelligence and remote sensing, combining satellite and aerial imagery with local data such as asset locations and wildfire maps. Overstory aims to improve safety and reliability of power delivery by enabling data-driven decision-making and future-proofing utilities' operations programs.

📋 Description

• As a Senior Machine Learning Ops Engineer at Overstory, you will design and build the foundations of our machine learning operations, ensuring our models are reliable, maintainable, and deliver real value to customers.
• You’ll help architect end-to-end systems for experiment tracking, data management, and scalable deployment.
• As one of our first dedicated MLOps hires, you’ll have significant ownership and influence over our technical direction, balancing best practices with pragmatic delivery to help our teams move fast while maintaining trust and reliability in production.
• You’ll also collaborate closely with data engineers, data scientists, and machine learning engineers, as well as future MLOps colleagues.
• In collaboration with your data and ML colleagues, you will design, build, and maintain processes and systems such as:
    • automated pipelines for training, testing, and deploying ML models
    • experiment tracking systems for performance metrics, data and model versioning, and documentation
    • processes and systems for the full model lifecycle, including registries, release and rollback strategies, and scalable model serving
    • monitoring and alerting for prediction quality, system health, and cost optimization
• You will also influence the direction of data and ML within Overstory by:
    • advocating for a balance between MLOps best practices and quick slices of value
    • aligning technical solutions with customer needs by collaborating with both engineering and product
    • ensuring our MLOps systems support regulatory, privacy, and security requirements

🎯 Requirements

• You love working in a remote-first, fast-moving environment where collaboration and adaptability are essential.
• 8+ years of experience designing and building production-grade ML pipelines and systems – but don’t filter yourself out if you feel you’re a strong candidate with 5+ years.
• Strong knowledge of experiment tracking, model deployment strategies, data versioning, and monitoring.
• Experience with ML infrastructure tools (e.g. MLflow, Kubeflow, Airflow, feature stores, model registries).
• Familiarity with GCP and Vertex AI preferred, but not required.
• Strong communication skills and ability to align technical solutions with business goals.
• Comfortable making architectural decisions and balancing best practices with practical trade-offs.

🏖️ Benefits

• Be part of truly mission-driven work that reduces wildfires, protects Earth’s natural resources, and helps solve our climate crisis.
• Flexible working environment with a lot of autonomy. We build our work days around our lives, not the other way around.
• Other benefits like a remote working budget, an educational budget, and time to develop new skills.
• Be surrounded by an excellent, vibrant, smart team whose members have each other's backs and believe in a culture of openness, tolerance, and respect.
• Equity and a competitive salary.

