Staff Site Reliability Engineer

🕒 January 9

PathAI

501 - 1000 employees

Founded 2016

🤖 Artificial Intelligence

⚕️ Healthcare Insurance

🧬 Biotechnology

💰 $165M Series C on 2021-05

PathAI is a healthcare technology company dedicated to advancing pathology through artificial intelligence. Its mission is to improve patient outcomes with AI-powered technology that offers insights for biomarker discovery and drug development. PathAI collaborates closely with biopharma companies and pathology laboratories to enhance laboratory workflows and improve diagnostics. Its AI-driven pathology solutions, such as the AISight Digital Pathology Image Management System, are used by major laboratories and research centers worldwide to power digital pathology and precision medicine initiatives. Top biopharma companies use PathAI's technology in drug discovery and diagnostics.

📋 Description

• Advancing the state of our operations by implementing SRE best practices, focusing on users, monitoring, and automation.
• Engineering infrastructure patterns for cloud environments in Amazon Web Services, building in security, reliability, and scalability.
• Designing, building, and operating our data center to support our rapidly growing Machine Learning team.
• Integrating on-premises data center environments with existing cloud infrastructure to create a seamless hybrid cloud environment.
• Improving the reliability and resilience of our infrastructure through root-cause analysis and by reviewing gaps in its design and implementation.
• Participating in platform on-call rotations and assisting with urgent incident response.

🎯 Requirements

• 8+ years of relevant experience.
• Automation: You work hard to eliminate toil by automating everything through scripting, configuration management tools (Ansible), and code (Python/Go).
• You’ve built monitoring infrastructure with modern observability tools (Datadog/Grafana/Prometheus).
• You’ve worked with infrastructure as code (Terraform/CloudFormation).
• You’ve administered physical hardware stacks in production settings (iDRAC/IPMI/Nvidia UFM/Juniper Systems).
• You’re opinionated on storage solutions and how they can be optimized for high-performance workloads (Quobyte/S3/FSx/EFS).
• Familiarity with modern network designs and comfort operating across network layers.
• Some experience with, and opinions on, virtualization, containerization, or container orchestration platforms (EKS/Cluster API/KVM).
• Operations experience: You’ve managed critical production infrastructure and are familiar with incident response, scaling, and the challenges of rapid growth.
• A bachelor's degree in Computer Science or equivalent experience.
• An insatiable intellectual curiosity and the ability to learn quickly in a complex space.
• Travel: Willingness to travel up to 25% of the time.

🏖️ Benefits

• Not Overtime Eligible
• Eligible for Equity
