Principal Site Reliability Engineer

1001 - 5000 employees

Founded 2012

🎲 Gambling

🎮 Gaming

👥 B2C

DraftKings Inc. is a digital sports entertainment company operating a leading online sportsbook, daily fantasy sports, and casino platform that delivers real-money betting and gaming experiences via web and mobile apps. It combines sports data, analytics, and content to engage fans, provides marketing and VIP/loyalty programs, and maintains global teams across engineering, product, compliance, and customer experience while emphasizing responsible gaming.

🔥 25 minutes ago

🇺🇸 United States – Remote

💵 $200k - $250k / year

⏰ Full Time

🔴 Lead

⛑ DevOps & Site Reliability Engineer (SRE)

AWS

Cloud

Distributed Systems

Google Cloud Platform

Kubernetes

Linux

Python

Terraform

📋 Description

• Define and execute the long-term strategy for our Kubernetes platform across Google Kubernetes Engine, Amazon Elastic Kubernetes Service, RKE2, and on-premise environments, ensuring reliability, scalability, and operational consistency.
• Drive architectural decisions across critical infrastructure, including cluster lifecycle management, networking, identity and access management, observability, autoscaling, capacity planning, and cost optimization.
• Lead large-scale platform initiatives across multiple engineering teams, establishing technical direction, engineering standards, and measurable outcomes that improve platform reliability and developer experience.
• Establish and evolve reliability practices by defining service level objectives, service level indicators, and error budget frameworks that align platform performance with business priorities.
• Build automation-first infrastructure through Infrastructure as Code, GitOps workflows, self-healing systems, and internal platform tooling that improve engineering velocity and reduce operational overhead.
• Champion the responsible adoption of AI-powered engineering capabilities that improve operational efficiency, accelerate incident response, and enhance developer productivity.
• Lead critical platform incidents, drive post-incident improvements, and strengthen platform resilience through automation, capacity planning, and operational excellence.
• Mentor senior engineers, influence technical strategy across the organization, and elevate engineering excellence through architecture reviews, coaching, and technical leadership.

🎯 Requirements

• A Bachelor's Degree in Computer Science or a related technical field.
• At least 8 years of experience designing, operating, and scaling distributed cloud and on-premise infrastructure, including at least 3 years operating at the Staff, Principal, or equivalent technical leadership level.
• Proven experience leading large-scale infrastructure or platform initiatives that require cross-functional alignment and long-term technical ownership.
• Deep expertise with Kubernetes, including cluster architecture, networking, storage, security, operators, lifecycle management, and large-scale production operations.
• Extensive experience building and operating production infrastructure in AWS and Google Cloud Platform using Infrastructure as Code technologies such as Terraform, Pulumi, or similar tools.
• Strong software development experience in Go, Python, or both, with expertise in GitOps, continuous integration and continuous delivery, observability, distributed systems, Linux, and reliability engineering principles.
• Experience incorporating AI-powered tools into engineering workflows while applying sound judgment around reliability, security, and operational risk.
• Exceptional communication and leadership skills with a proven ability to mentor engineers, influence technical strategy, and drive engineering excellence.
• Preferred: experience in regulated industries or hybrid cloud environments, contributions to open-source projects, or cloud certifications.

🏖️ Benefits

• Bonus
• Equity
• Benefits as applicable
