Senior SRE, Ads

Reddit, Inc.

501 - 1000 employees

Founded 2005

👥 B2C

📱 Media

🌍 Social Impact

Reddit, Inc. is a social media platform that acts as a hub for thousands of communities, where users can engage in diverse conversations ranging from breaking news to niche interests. It enables users to post, comment, and vote on content, fostering a vibrant online community. Millions of people globally connect and share their passions on Reddit, creating a dynamic environment for authentic human interaction.

📋 Description

• Partner with Ads Engineering teams to improve reliability, scalability, and operational excellence of ad-serving, auction, targeting, measurement, and billing systems.
• Design, build, and maintain infrastructure, tooling, and automation that improve service reliability and engineering productivity.
• Improve observability through monitoring, alerting, tracing, logging, and dashboards.
• Participate in on-call rotations and lead incident response efforts for critical production systems.
• Run root cause analysis and drive corrective actions following incidents.
• Collaborate with software engineers throughout the service lifecycle, from design reviews through production operations.
• Drive adoption of SRE best practices including SLIs, SLOs, error budgets, capacity planning, and operational readiness reviews.
• Reduce operational toil through automation and self-service tooling.
• Help define and measure advertiser-critical user journeys such as campaign creation, ad delivery, reporting, and billing.
• Scale Ads systems to support continued traffic growth, increased advertiser demand, and evolving business requirements.

🎯 Requirements

• 5+ years of experience in Site Reliability Engineering, Infrastructure Engineering, or related roles operating large-scale distributed systems.
• Strong experience supporting high-traffic, user-facing production environments.
• Good understanding of distributed systems, networking, Linux systems, and cloud-native architectures.
• Good programming skills in languages such as Go, Python, or similar.
• Demonstrated ability to troubleshoot complex issues across applications, infrastructure, networking, and services.
• Experience with observability platforms, monitoring systems, alerting, and incident response.
• Experience driving automation and operational improvements.

🏖️ Benefits

• Global Benefit programs that fit your lifestyle, from workspace to professional development to caregiving support
• Family Planning Support
• Gender-Affirming Care
• Mental Health & Coaching Benefits
• Private Medical, Dental, and Vision Benefits
• Personal Retirement Savings Account with matching contribution
• Cycle to Work and Tax Saver schemes
• Flexible Vacation & Paid Volunteer Time Off
• Generous Paid Parental Leave
