Senior Site Reliability Engineer

🔥 13 minutes ago

Zipdev

51 - 200 employees

Founded 2017

🎯 Recruiter

🤝 B2B

👥 HR Tech

Recruitment • B2B • HR Tech

Zipdev is a company that specializes in providing remote talent solutions, particularly focused on hiring top Latin American professionals. Zipdev enables companies to build high-performing teams in their own time zones while reducing costs compared to local hiring. The company offers a streamlined hiring process that includes defining hiring needs, selecting top candidates, and onboarding new team members, handling payroll and HR overhead. Zipdev fills a variety of technical and professional roles, such as software engineers, project managers, and designers. The company is recognized for its ability to provide cultural alignment and scalable, flexible hiring solutions.

📋 Description

• Support observability tooling implementation (Datadog and/or Azure Monitor/App Insights) and help build SLO definitions, alert rules, and synthetic checks
• Participate in a PagerDuty on-call rotation, including escalation handling and incident documentation
• Build and maintain operational runbooks for incident response, rollback, and recovery scenarios
• Contribute to deployment automation work (blue/green or canary patterns) and Infrastructure as Code
• Work across Azure SQL and Cosmos DB environments, supporting performance and cost optimization initiatives
• Collaborate closely with US-based engineers during overlapping working hours

🎯 Requirements

• 5+ years in SRE, DevOps, or cloud infrastructure roles
• Strong hands-on experience with Microsoft Azure (Azure SQL, Cosmos DB, Container Apps, App Service)
• Experience with observability tooling (Datadog, Azure Monitor, or similar) and on-call/incident response
• Familiarity with Infrastructure as Code (Terraform preferred)
• Strong written and spoken English; you'll be in daily communication with US-based team members and, at times, client stakeholders
• **Availability with meaningful overlap with US Eastern or Mountain time zones**
• Experience working in HIPAA-regulated environments, including handling PHI under a Business Associate Agreement (BAA) and working within least-privilege, audited access controls
• Willingness to complete a healthcare-industry-standard background check prior to production access

**On-Call Expectations**

• This role includes participation in a pager-based on-call rotation via PagerDuty, covering SEV-1/SEV-2 incidents on a shared schedule with the SRE team. This is a core, required part of the role, not an occasional ask.

🏖️ Benefits

• Work remotely
• Vacation: 10 business days a year
• Holidays: 5 National Holidays a year
• Company Holidays: 5 Company Holidays a year (Christmas Eve, Christmas Day, New Year's Eve, New Year's Day, Zipdev Day)
• Parental Leave
• Health Care Reimbursement
• Active Lifestyle Reimbursement
• Quarterly Home Office Reimbursement
• Payroll Deduction Purchase Plans
• Longevity Bonus
• Continuous Learning Bonus
• Access to Training and Professional Development Platforms
• Did we mention it's REMOTE?!!

Similar Jobs

🔥 12 hours ago

Sensedia

501 - 1000

🔌 API

☁️ SaaS

💳 Fintech

DevOps Engineer ensuring alignment of technological choices with corporate architecture at Sensedia. Working with cloud solutions and fostering a culture of innovation and agility.

🗣️🇧🇷🇵🇹 Portuguese Required

Ansible

AWS

Chef

Cloud

Docker

ElasticSearch

Google Cloud Platform

Grafana

Kubernetes

Linux

Logstash

Prometheus

Python

Terraform

🕒 Yesterday

Oowlish

51 - 200

🤝 B2B

💳 Fintech

Senior Site Reliability Engineer responsible for maintaining business-critical production systems at Oowlish. Collaborating globally while driving reliability and operational excellence within high-availability environments.

Cloud

Python

TypeScript

Go

🕒 Yesterday

Kenlo

51 - 200

🏠 Real Estate

☁️ SaaS

🤝 B2B

Cloud Infrastructure Analyst ensuring performance, security, scalability and operational efficiency in cloud environments at Kenlo.

🗣️🇧🇷🇵🇹 Portuguese Required

Cloud

DNS

ElasticSearch

Firewalls

Google Cloud Platform

Grafana

Kubernetes

Prometheus

Redis

Terraform

🕒 Yesterday

CI&T

5001 - 10000

🤖 Artificial Intelligence

☁️ SaaS

Senior SRE / Cloud Engineer managing AI infrastructure on Oracle Cloud. Collaborating across teams for scalable and reliable systems supporting AI applications.

🗣️🇧🇷🇵🇹 Portuguese Required

Cloud

Kafka

Kubernetes

Oracle

Terraform

🕒 4 days ago

Devexperts

501 - 1000

💳 Fintech

☁️ SaaS

💸 Finance

Site Reliability Engineer ensuring stability of trading platforms at Devexperts. Collaborating with teams to deploy and maintain services effectively while fostering automation.

🗣️🇧🇷🇵🇹 Portuguese Required

Ansible

Apache

Cloud

Docker

ElasticSearch

Firewalls

Grafana

HAProxy

Linux

NGINX

OpenShift

TCP/IP

Terraform

Unix