Director of Platform Engineering

🕒 April 7

Overstory

11 - 50 employees

🤖 Artificial Intelligence

⚡ Energy

☁️ SaaS

Overstory provides software that helps utilities optimize resources, mitigate vegetation risk, and manage infrastructure more intelligently, with a particular focus on preventing power outages and wildfires. It delivers actionable vegetation intelligence using artificial intelligence and remote sensing, combining satellite and aerial imagery with local data such as asset locations and wildfire maps. Overstory aims to improve the safety and reliability of power delivery by enabling data-driven decision-making and future-proofing utilities' operations programs.

📋 Description

• Own the platform strategy across Platform, MLOps, and SRE, aligning it with company and product goals
• Grow senior ICs (and eventually managers) across multiple teams, fostering strong technical leadership and healthy team cultures
• Define and evolve our platform vision, including developer experience, internal tooling, CI/CD, infrastructure, observability, and reliability standards
• Oversee MLOps systems that support model development, training, deployment, monitoring, and governance in production
• Partner cross-functionally with Product, ML, Data, Security, and Compliance to ensure the platform meets current and future needs
• Balance speed and stability, making thoughtful tradeoffs between innovation, cost, reliability, and operational excellence
• Set metrics and accountability for platform performance, reliability, and developer productivity

🎯 Requirements

• 10+ years of experience leading platform, infrastructure, or reliability teams in a scaling startup environment, including navigating ambiguity, rapid growth, and the transition from early-stage systems to more mature, repeatable platforms
• Strong understanding of cloud-native infrastructure (GCP strongly preferred), Kubernetes, CI/CD, and modern DevOps practices
• Experience supporting machine learning systems in production, including model deployment, monitoring, and lifecycle management
• A track record of building reliable, scalable systems used by fast-moving product teams
• Excellent people leadership skills: you know how to coach managers, grow talent, and build inclusive, high-performing teams
• Strong communication skills and the ability to influence across org boundaries
• Experience in data- or ML-heavy products
• Nice to have: experience with geospatial data, image processing, or mapping technology

🏖️ Benefits

• To be part of truly mission-driven work that reduces wildfires, protects Earth’s natural resources, and helps solve our climate crisis.
• Flexible working environment with a lot of autonomy. We build our work days around our lives, not the other way around.
• Other benefits like a remote working budget, an educational budget, and time to develop new skills.
• To be surrounded by an excellent, vibrant, smart team who have each other's backs and believe in a culture of openness, tolerance, and respect.
• Equity and a competitive salary.
