Site Reliability Engineer

🕒 March 18

🇺🇸 United States – Remote

💵 $156k - $288k / year

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

🦅 H1B Visa Sponsor

Ditto

11 - 50 employees

Founded 2018

🔌 API

📡 Telecommunications

API • Software • Telecommunications

Ditto is a platform that enables peer-to-peer data synchronization across various devices, including mobile and IoT, even in offline modes. It provides a flexible SDK that can be integrated into existing applications for seamless data flow and real-time updates. By supporting multiple programming languages and offering automatic conflict resolution, Ditto ensures developers can modernize applications quickly while maintaining high data reliability and connectivity.

📋 Description

• Develop and maintain observability solutions using platforms like Datadog, Prometheus and Grafana
• Take a leading role in incident management, including coordinating response efforts, troubleshooting issues, and identifying follow-up actions
• Partner with product engineering teams to architect reliable systems, recover from incidents, and learn from mistakes
• Work with teams to implement and maintain SLOs, monitoring, and alerting strategies that ensure reliability at scale
• Design and implement automation and support tooling to improve system resilience, maintain operational safety and reduce operational overhead
• Lead the development and maintenance of runbooks, alert definitions, and incident response procedures
• Participate in on-call rotations to provide 24/7 support for critical production systems

🎯 Requirements

• 4+ years of experience in Site Reliability Engineering or similar DevOps roles focused on system reliability and incident management
• 2+ years of hands-on experience architecting applications for Kubernetes, and managing Kubernetes infrastructure
• Strong experience with modern monitoring stacks including Prometheus, Grafana, and Datadog
• Experience in at least one systems programming language, such as Go, Rust, C, or Java
• Expertise with Infrastructure as Code tools, like Terraform and Helm
• Expertise with at least one major cloud service provider (AWS, GCP, Azure)
• Strong communication skills, with the ability to lead incident response and effectively collaborate across teams
• Willingness and experience engaging with on-call rotations and emergency response procedures
• A high degree of agency and bias towards action. Identify problems and work autonomously to solve them
• Excellent problem-solving skills and a methodical approach to troubleshooting complex issues

🏖️ Benefits

• Health insurance
• Dental insurance
• Vision insurance
• Life insurance
• Disability insurance
• 401(k)
• Flexible spending accounts
• Flexible time off

Similar Jobs

🕒 March 18

CAKE.com

201 - 500

⚡ Productivity

☁️ SaaS

🏢 Enterprise

SRE managing scalable infrastructure for CAKE.com, ensuring seamless user experience and high traffic handling. Involves automation, monitoring, and incident resolution processes.

🇺🇸 United States – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

🕒 March 18

Moonlite

1 - 10

📚 Education

🏪 Marketplace

👥 B2C

Build and operate production-grade AI infrastructure for organizations running intensive computational research. Leverage deep Kubernetes expertise for high-performance workloads.

🇺🇸 United States – Remote

💵 $165k - $225k / year

⏰ Full Time

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

🕒 March 18

Owner.com

201 - 500

☁️ SaaS

🤝 B2B

🏪 Marketplace

Senior DevOps Engineer evolving and operating Owner’s cloud platform. Design systems for reliability, security, and developer productivity as we scale.

🇺🇸 United States – Remote

💵 $190k - $240k / year

💰 $120M Series C - Owner on 2025-05

⏰ Full Time

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

🕒 March 18

Vytwo Technologies Inc

201 - 500

🤝 B2B

🏢 Enterprise

🎯 Recruiter

MEAN stack Architect with DevOps expertise for TCoE, designing scalable applications and leading technical teams in a fully remote environment.

🇺🇸 United States – Remote

💵 $45 - $50 / hour

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

🕒 March 17

Truv

51 - 200

Senior DevOps Engineer architecting and scaling AWS infrastructure and building observability platforms. Leading compliance projects and optimizing CI/CD pipelines in a remote setup.