Site Reliability Engineer II

November 24

Backblaze

Cloud Storage • eCommerce • Enterprise

Backblaze is a cloud storage company that provides scalable, secure data backup for businesses and individuals. Its B2 Cloud Storage service offers S3-compatible object storage with transparent pricing, so users can protect and manage their data easily. Backblaze also specializes in automatic, unlimited computer backup with data recovery options, and supports integrations with third-party applications.

201 - 500 employees

Founded 2007

🛍️ eCommerce

🏢 Enterprise

💰 $5M Series A on 2012-07

📋 Description

• Support the availability and durability of critical services across production environments.
• Monitor service health using SLIs, SLOs, and error budgets, and escalate issues when thresholds are at risk.
• Participate in on-call rotations, incident response, and post-incident reviews to drive service improvements.
• Follow established ITIL/OSS processes (incident, change, problem, and capacity management).
• Develop automation for common operational tasks, reducing manual intervention and toil.
• Contribute to monitoring, logging, and alerting frameworks (e.g., Prometheus, Grafana, Catchpoint, ELK).
• Work with CI/CD pipelines, configuration management, and infrastructure as code tools (Terraform, Ansible, Jenkins).
• Write scripts (Bash, Python, Go, etc.) to improve system reliability and efficiency.
• Partner with engineering, product, and operations teams to support resilient system design and operations.
• Assist in capacity planning and disaster recovery exercises.
• Work with vendors and service providers to troubleshoot service issues and track SLA performance.
• Document systems, share learnings, and help grow a reliability-minded engineering culture.
• Contribute to playbooks, runbooks, and operational documentation.
• Identify recurring issues and propose long-term improvements.
• Promote reliability-focused practices within development and operations teams.

🎯 Requirements

• Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).
• 2–4 years of experience in site reliability, systems engineering, or operations.
• Exposure to large-scale, production-grade systems.
• Solid Linux systems administration and troubleshooting skills.
• Familiarity with service reliability concepts - monitoring, alerting, incident response, and root cause analysis.
• Proficiency in at least one scripting language (Python, Bash, or Go).
• Understanding of containers (Kubernetes, Docker) and microservices concepts.
• Knowledge of incident response and operational best practices.
• Experience in a SaaS, service provider, or distributed systems environment.

🏖️ Benefits

• Diversity, equity, and inclusion initiatives
• Professional development opportunities
