Site Reliability Engineer II

201 - 500 employees

Founded 2007

🛍️ eCommerce

🏢 Enterprise

💰 $5M Series A on 2012-07

Cloud Storage • eCommerce • Enterprise

Backblaze is a cloud storage company that provides scalable, secure data backup for businesses and individuals. Its B2 Cloud Storage service offers S3-compatible object storage with transparent pricing, letting users protect and manage their data. Backblaze also offers automatic, unlimited backup for computers, with data protection and recovery options, and supports integration with applications.

🕒 July 15

🇺🇸 United States – Remote

⏰ Full Time

🟢 Junior

🟡 Mid-level

⛑ DevOps & Site Reliability Engineer (SRE)

🦅 H1B Visa Sponsor

Ansible

Docker

Grafana

Jenkins

Kubernetes

Linux

Microservices

Prometheus

Python

Terraform

Backblaze

📋 Description

• Support the availability and durability of critical services across production environments.
• Monitor service health using SLIs, SLOs, and error budgets, and escalate issues when thresholds are at risk.
• Participate in on-call rotations, incident response, and post-incident reviews to drive service improvements.
• Follow established ITIL/OSS processes (incident, change, problem, and capacity management).
• Develop automation for common operational tasks, reducing manual intervention and toil.
• Contribute to monitoring, logging, and alerting frameworks (e.g., Prometheus, Grafana, Catchpoint, ELK).
• Work with CI/CD pipelines, configuration management, and infrastructure as code tools (Terraform, Ansible, Jenkins).
• Write scripts (Bash, Python, Go, etc.) to improve system reliability and efficiency.
• Partner with engineering, product, and operations teams to support resilient system design and operations.
• Assist in capacity planning and disaster recovery exercises.
• Work with vendors and service providers to troubleshoot service issues and track SLA performance.
• Document systems, share learnings, and help grow a reliability-minded engineering culture.
• Contribute to playbooks, runbooks, and operational documentation.
• Identify recurring issues and propose long-term improvements.
• Promote reliability-focused practices within development and operations teams.

🎯 Requirements

• Bachelor's degree in Computer Science, Engineering, or related field (or equivalent experience).
• 2–4 years of experience in site reliability, systems engineering, or operations.
• Exposure to large-scale, production-grade systems.
• Solid Linux systems administration and troubleshooting skills.
• Familiarity with service reliability concepts - monitoring, alerting, incident response, and root cause analysis.
• Proficiency in at least one scripting language (Python, Bash, or Go).
• Understanding of containers (Kubernetes, Docker) and microservices concepts.
• Knowledge of incident response and operational best practices.

🏖️ Benefits

• Competitive salary
• Flexible working hours
• Professional development budget
• Home office setup allowance
• Global team events
