Platform Site Reliability Engineer

SaaS • Enterprise

Nexthink is a Digital Employee Experience (DEX) platform that empowers IT teams to see, diagnose, and fix digital workplace issues. It leverages AI-powered solutions for real-time alerting, intelligent diagnostics, and automated remediation, ensuring optimization of workplace applications, collaboration tools like Teams and Zoom, and overall employee engagement. Nexthink helps organizations enhance IT efficiency, manage digital transformation, and maintain cost-effective digital work environments with measurable impact and operational excellence. The platform supports over 15 million endpoints globally, providing unparalleled visibility and automation for proactive IT management and service desk efficiency.

501 - 1000 employees

Founded 2011

☁️ SaaS

🏢 Enterprise

💰 Series D on 2021-02

November 4

🌵 Arizona – Remote

💵 $174k - $272k / year

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

AWS

Azure

Cloud

Google Cloud Platform

Grafana

Kubernetes

Prometheus

Python

Terraform

📋 Description

• Design, build, and maintain the infrastructure powering our multi-tenant SaaS platform
• Implement and manage cloud-native systems (AWS) using best-in-class tools and automation
• Operate and enhance Kubernetes clusters, deployment pipelines, and service meshes
• Establish and enforce SLOs, SLAs, and error budgets
• Develop infrastructure as code (Terraform or similar)
• Monitor system health, application performance, and user-facing SLAs using tools like Datadog, Prometheus, Grafana
• Participate in a shared on-call rotation, responding to incidents
• Work closely with software engineers to embed reliability and observability into every service

🎯 Requirements

• Minimum BS in Computer Science/Engineering
• 5+ years in an SRE/platform engineering role supporting SaaS platforms
• Strong hands-on experience with public cloud services (AWS, GCP, Azure)
• Proficiency with Kubernetes and container-based deployment
• Strong programming or scripting skills (Python, Go, Bash)
• Experience with CI/CD pipelines (e.g., GitHub Actions, GitLab CI, ArgoCD)
• Experience with observability stacks (Prometheus, ELK/EFK, Datadog)
• Comfort with being part of a rotating on-call schedule
• Strong system-level troubleshooting skills

🏖️ Benefits

• Health insurance
• Dental insurance
• Vision insurance
• Life insurance
• Long-term disability
• Accidental death/personal loss coverage
• Flexible Hours and unlimited vacation
• 11 company-paid holidays
• 3 extra days for volunteering
• Hybrid work model with structured onboarding
• Free access to professional training platforms
• Up to 16 weeks of paid leave for birthing parents/primary caregivers
• 6 weeks for secondary caregivers
• 401(k) plan with up to 4% company matching contributions
• Bonuses for referring successful hires after three months of continuous employment
