Lead Site Reliability Developer – CSRE Consulting

10,000+ employees

Founded 1996

📱 Media

💰 Post-IPO Debt on 2023-01

Media • Entertainment

Live Nation Entertainment is the global leader in live entertainment, powering unforgettable experiences around the world. Artist-powered and fan-driven, Live Nation works with musicians to bring their creativity to life on stages across the globe. As the top producer of concerts, ticket seller, and brand connector to music, Live Nation's platform leads the market in these three core industries. Their mission extends beyond entertainment, aiming to uplift, inspire, and create memories through the power of live music.

🕒 May 1

🇬🇧 United Kingdom – Remote

⏰ Full Time

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

AWS

Distributed Systems

Kubernetes

📋 Description

• Lead consulting work from discovery through delivery by aligning stakeholders on priorities, sequencing work, and communicating measurable outcomes.
• Establish working cadence and facilitate decision forums to surface risks, map dependencies, and drive clear ownership and timelines.
• Align product, platform, and engineering stakeholders on reliability targets and trade-offs using SLOs and error budgets.
• Partner regularly with Engineering Managers, product managers, Staff and Principal engineers, and platform leads to keep dependencies, decisions, and delivery aligned.
• Identify systemic risks across shared dependencies and coordinate remediation across multiple teams to reduce recurring incidents.
• Drive change adoption by embedding reliability mechanisms into partner team routines such as planning, PRRs, and on-call practices.
• Design and implement reusable reliability mechanisms, templates, and tooling that can be adopted across teams.
• Establish and evolve production readiness review practices with partner teams to improve launch quality and change safety.
• Drive observability strategy for partner domains by improving signal quality, alerting philosophy, and operational dashboards.
• Lead complex incident investigations and ensure learnings translate into durable fixes with clear owners and verification.
• Lead reliability-focused design and code reviews and guide teams toward simpler, safer architectures.
• Mentor Senior engineers and other consultants through pairing, reviews, and structured coaching to multiply impact.
• Partner with internal platform engineering to influence roadmaps and deliver shared capabilities that accelerate SRE adoption.
• Improve CSRE Consulting playbooks and operating practices based on repeated patterns observed across teams.

🎯 Requirements

• Deep practical understanding of SRE principles, including SLO governance and error budget policy in practice
• Proven ability to lead cross-team technical work and influence without authority
• Strong experience designing and troubleshooting distributed systems with cross-service failure modes
• Experience shaping observability and alerting strategy and improving operational signal quality
• Strong Kubernetes and AWS experience, including governance and cost trade-offs
• Ability to design reliability automation and tooling that is reusable and adopted by multiple teams
• Experience leading production readiness and resilience practices, including DR validation and controlled testing
• Strong software engineering fundamentals with the ability to deliver and review high-quality changes in enterprise codebases
• Advanced incident analysis skills focused on systemic risk reduction and organizational learning
• Excellent communication skills, including exec-ready summaries and clear technical diagrams

🏖️ Benefits

• Generous vacation
• Healthcare
• Retirement benefits
• Student loan repayment
• Tuition reimbursement
• Six months of paid caregiver leave for new parents including fostering
• Access to free live events through our exclusive employee ticketing program
