Senior Site Reliability Engineer

🕒 May 14

Arctiq

201 - 500 employees

🏢 Enterprise

☁️ SaaS

🔐 Security

Arctiq specializes in enterprise security, modern infrastructure, and platform engineering, helping businesses architect practical and efficient solutions. Its offerings include managed security services, cloud security, and modern infrastructure solutions such as wireless networking and hybrid cloud infrastructure. Arctiq works with sectors including healthcare, education, and government to improve connectivity and strengthen security. Its projects range from video surveillance in schools to smart city initiatives and cloud-native solutions for industries like oil & gas.

📋 Description

• Define the strategy for Service Level Objectives (SLOs) and Error Budgets.
• Design complex telemetry pipelines for full-stack observability.
• Design and govern the enterprise Infrastructure as Code (IaC) standards.
• Develop custom tooling to automate complex recovery procedures and system scaling.
• Act as the Incident Commander for major system outages, leading the technical response and directing the Root Cause Analysis (RCA) process.
• Lead the integration of security-as-code within DevSecOps pipelines, ensuring full compliance with RMF and NIST 800-53 standards.
• Provide technical guidance and mentorship to Mid-Level SREs and developers, fostering a culture of reliability across the organization.

🎯 Requirements

• 7+ years of experience in SRE or DevOps, with significant experience in distributed systems.
• Expertise in Go, Python, or Java and advanced knowledge of Linux internals.
• Extensive experience managing production Kubernetes environments and complex cloud architectures.
• Proven track record of defining and meeting SLOs for high-availability systems.
• Experience navigating government Risk Management Framework (RMF) processes.
• Education: Bachelor’s or Master’s degree in Computer Science or Engineering.
• Certifications: CKA (Certified Kubernetes Administrator) and industry observability certification preferred.
