Principal Site Reliability Engineer, SRE

51 - 200 employees

🏥 Healthcare

🎖️ Defense

🏭 Manufacturing

Symmetrio is a full-service recruiting, staffing, and consulting company with decades of experience across various high-growth industries. They specialize in permanent placement, contract-to-hire, and staff augmentation, focusing on aligning talent with client goals and cultures. Symmetrio offers tailored recruitment solutions and advisory services in sectors such as life sciences, information technology, engineering, medical devices, logistics solutions, manufacturing, and building automation. Their team of talent acquisition experts is committed to understanding clients’ organizational needs and delivering specialized professionals to meet these challenges. With a strong emphasis on trust, understanding, and collaboration, Symmetrio aims to optimize operations and drive innovation and excellence for their clients.

🕒 June 11

🇺🇸 United States – Remote

⏰ Full Time

🔴 Lead

⛑ DevOps & Site Reliability Engineer (SRE)

AWS

Cloud

Django

Grafana

Kubernetes

Python

Terraform

SoluStaff

📋 Description

• Serve as the primary technical owner for production reliability across U.S. customer environments.
• Investigate and resolve complex issues spanning web applications, APIs, backend services, data pipelines, cloud infrastructure, and customer integrations.
• Lead production incident response efforts, coordinating cross-functional teams to restore service and minimize customer impact.
• Perform root cause analysis and drive corrective actions that improve long-term system stability and resilience.
• Partner with software engineering and platform teams to identify recurring reliability risks and implement sustainable solutions.
• Design, configure, and validate secure customer connectivity solutions including Site-to-Site VPNs, Transit Gateway integrations, routing configurations, and secure network paths.
• Support customer onboarding initiatives by troubleshooting connectivity challenges and ensuring consistent implementation processes.
• Enhance platform observability through improvements in monitoring, logging, alerting, tracing, and operational dashboards.
• Contribute to CI/CD, infrastructure automation, and deployment processes that improve release safety and operational consistency.
• Develop operational tooling that supports incident response, troubleshooting, onboarding, and system monitoring activities.
• Collaborate with engineering leadership to improve cloud architecture, scalability, security, and operational readiness.
• Partner with customer-facing teams to communicate technical issues, remediation plans, and reliability improvements in a clear and effective manner.
• Support compliance, security, and risk management initiatives within highly regulated healthcare environments.

🎯 Requirements

• 6+ years of hands-on experience supporting and managing AWS-based production environments.
• 4+ years of experience supporting web applications and backend services (Python/Django experience strongly preferred).
• Experience with AWS networking technologies including VPCs, Site-to-Site VPNs, Transit Gateways, routing, NAT gateways, and security groups.
• Strong experience with Terraform and infrastructure-as-code deployment practices.
• Experience with containerized environments including ECS, Fargate, Kubernetes, or similar technologies.
• Experience building and supporting CI/CD pipelines and release automation processes.
• Familiarity with monitoring and observability platforms such as Datadog, CloudWatch, Sentry, Grafana, or similar tools.
• Experience leading production incidents, outage management, and root cause analysis initiatives.
• Exposure to Windows Server environments, Active Directory, Kerberos, and enterprise infrastructure concepts is preferred.
• Healthcare technology, healthcare SaaS, clinical software, or other regulated industry experience is highly preferred.
• Bachelor’s degree in Computer Science, Engineering, Information Technology, or a related technical field preferred.

🏖️ Benefits

• Health Care Plan (Medical, Dental & Vision)
• Retirement Plan (401k, IRA)
• Paid Time Off (Vacation, Sick & Public Holidays)

Similar Jobs

Software Architect – DevOps, AI, LLM Focus

🕒 June 11

HBK - Hottinger Brüel & Kjær

1001 - 5000

💼 Consulting

🏥 Healthcare

📦 Logistics

Software Architect leading architectural direction on DevOps/AI/LLM technologies for ReliaSoft's cloud and desktop products. Collaborating with teams to enhance product capabilities and modernize systems.

🇺🇸 United States – Remote

💵 $100k - $130k / year

⏰ Full Time

🟠 Senior

🔴 Lead

⛑ DevOps & Site Reliability Engineer (SRE)

Cloud

Staff Site Reliability Engineer, Core AI Infrastructure

🕒 June 9

Coinbase

1001 - 5000

💼 Consulting

₿ Crypto

💸 Finance

Staff Site Reliability Engineer driving AI transformation by ensuring reliability and automation at Coinbase. Collaborating with infrastructure teams and leading critical incident responses to maintain service excellence.

🇺🇸 United States – Remote

💵 $218k - $256.5k / year

💰 $21.4M Post-IPO Equity on 2022-11

⏰ Full Time

🔴 Lead

⛑ DevOps & Site Reliability Engineer (SRE)

🦅 H1B Visa Sponsor

Ansible

AWS

Chef

Cloud

Docker

Kubernetes

Puppet

Python

Ruby

SaltStack

Terraform

Manager, Site Reliability Engineering

🕒 June 8

Aya Healthcare

5001 - 10000

🏥 Healthcare

💼 Consulting

📦 Logistics

Lead the SRE team at Aya Healthcare for enhancing product reliability and operational efficiency. Manage incident responses and AI-native operations for a top healthcare workforce solutions provider.

🇺🇸 United States – Remote

💵 $230k - $255k / year

⏰ Full Time

🟠 Senior

🔴 Lead

⛑ DevOps & Site Reliability Engineer (SRE)

🦅 H1B Visa Sponsor

AWS

Azure

Google Cloud Platform

Site Reliability Engineer

🕒 June 8

MKS2 Technologies

201 - 500

💼 Consulting

🏥 Healthcare

📦 Logistics

Site Reliability Systems Engineer working with monitoring tools to enhance VA's infrastructure reliability. Collaborating across teams to resolve outages and improve service quality for veterans.

🇺🇸 United States – Remote

⏰ Full Time

🟠 Senior

🔴 Lead

⛑ DevOps & Site Reliability Engineer (SRE)

AWS

Azure

Cloud

Java

JavaScript

Linux

Oracle

ServiceNow

Splunk

Unix

Principal DevOps Engineer

🕒 June 4

Headspace

501 - 1000

🏥 Healthcare

🧘 Wellness

⚕️ Healthcare Insurance

Principal DevOps Engineer at Headspace ensuring platform reliability for 65 million users. Lead cloud strategies and mentor engineering teams to deliver innovative features.

🇺🇸 United States – Remote

💵 $162k - $225k / year

⏰ Full Time

🔴 Lead

⛑ DevOps & Site Reliability Engineer (SRE)

🦅 H1B Visa Sponsor

AWS

Cloud

EC2

Terraform