Senior Site Reliability Engineer

🕒 April 30

Juul Labs

1001 - 5000 employees

👥 B2C

🛒 Retail

🧘 Wellness

Juul Labs provides alternatives to traditional combustible cigarettes through its vaporizing technology. Its mission is to help adult smokers transition to less harmful nicotine delivery systems while enforcing strict measures to prevent underage access to its products. Juul Labs offers a range of vaping devices and e-liquids, emphasizing safety, ease of use, and compliance with legal age restrictions.

📋 Description

• A Senior Site Reliability Engineer (SRE) is expected to own the operational stability and performance of Juul’s hybrid cloud infrastructure (Nutanix, AWS/GCP).
• This involves leading automation efforts, architecting for reliability, and acting as the final escalation point for critical incidents to ensure the platform is scalable and efficient.
• Design, deploy, and maintain enterprise-scale Nutanix AHV clusters and Prism Central for multi-cluster management
• Expert-level proficiency with Nutanix CLI (nCLI and acli) for advanced operations, troubleshooting, and automation
• Develop automation scripts using Nutanix REST APIs, Python SDK, PowerShell, and Terraform for infrastructure-as-code
• Create and manage VM templates, golden images, and standardized deployment catalogs for consistent provisioning
• Design disaster recovery solutions using Leap, Protection Domains, cross-cluster replication, and metro clustering
• Implement network micro-segmentation using Nutanix Flow and configure RBAC, encryption, and security hardening
• Lead L3 troubleshooting using advanced diagnostics, log analysis (CVM, Genesis), NCC health checks, and cluster service resolution
• Configure high availability, VM affinity rules, QoS policies, and optimize performance for mission-critical workloads
• Manage AHV networking with OVS bridges, VLANs, bonds, LACP and implement resource reservations and workload balance
• Design, deploy, and maintain hybrid cloud infrastructure across Nutanix HCI, AWS, and GCP platforms
• Architect and implement multi-cloud solutions ensuring high availability, scalability, and disaster recovery

🎯 Requirements

• 8-12+ years of infrastructure experience, with 8+ years in Nutanix HCI and enterprise cloud (AWS/GCP)
• Expert-level skills in Python, PowerShell, Bash scripting, infrastructure-as-code (Terraform/CloudFormation), and container orchestration (Kubernetes, EKS/GKE)
• Proven experience managing enterprise-scale environments, hybrid cloud migrations, disaster recovery, and L3 critical incident management
• Strong networking knowledge (TCP/IP, VLANs, routing, VPN), security hardening, and compliance frameworks (ITIL)
• Strategic thinker with exceptional analytical and troubleshooting abilities for complex multi-layer infrastructure issues
• Excellent communication skills to translate technical concepts to executives and non-technical stakeholders
• Calm under pressure during critical outages with meticulous attention to security, compliance, and configuration management
• Self-motivated continuous learner committed to staying current with evolving cloud technologies and automation opportunities
• Available for on-call rotations with strong documentation skills and customer service orientation
• Certifications (plus): Nutanix NCP/NCAP, AWS Solutions Architect Professional, AWS DevOps Professional, GCP Professional Cloud Architect, Terraform

🏖️ Benefits

• People. Work with talented, committed and supportive teammates
• Equity and performance bonuses. Every employee is a stakeholder in our success
• Cell phone subsidy, commuter benefits and discounts on JUUL products
• Excellent medical, dental and vision, disability, and life insurance, plus family support, wellness, legal, and employee assistance program benefits
• 401(k) plan with company matching
• Plus biannual discretionary performance bonuses

Similar Jobs

🕒 April 30

NBA

11 - 50

🏠 Real Estate

🤝 B2B

Senior Site Reliability Engineer ensuring the reliability of messaging platforms for NBA operations. Leading incident response and supporting executive teams in high-pressure environments.

SMTP

🕒 April 30

Prompt Therapy Solutions Inc

11 - 50

⚕️ Healthcare Insurance

⚡ Productivity

☁️ SaaS

Senior DevOps Engineer managing infrastructure and deployment processes for healthcare tech company Prompt Therapy. Leading a team and ensuring scalability, security, and reliability in cloud environments.

Ansible

AWS

Azure

Cloud

Docker

Google Cloud Platform

Grafana

Kubernetes

Prometheus

Python

Terraform

Go

🕒 April 29

HHAeXchange

501 - 1000

⚕️ Healthcare Insurance

☁️ SaaS

📋 Compliance

SRE Technical Project Manager at HHAeXchange creating processes for site reliability and project management. Leading teams to improve system stability, resiliency, and automation in operations.

🕒 April 29

The Home Depot

10,000+ employees

🛒 Retail

👥 B2C

Senior Software Engineer for Site Reliability Engineering at Home Depot. Building and operating internal platforms for store systems' reliability and observability.

BigQuery

Cloud

Google Cloud Platform

JavaScript

Kubernetes

Python

Selenium

Spinnaker

Terraform

TypeScript

Go

🕒 April 29

Satsuma Technology Ltd

1 - 10

🔌 API

🤖 Artificial Intelligence

🛍️ eCommerce

Senior Site Reliability Engineer managing multi-cloud infrastructure at Satsuma. Ensuring reliability, scalability, and operational posture using AI-assisted development.

AWS

Azure

Cloud

Google Cloud Platform

Grafana

Kubernetes

Terraform