Site Reliability Engineer

1001 - 5000 employees

Founded 2006

☁️ SaaS

🔒 Cybersecurity

🏢 Enterprise

💰 €500,000,000 Private Equity Round in 2019-01

SaaS • Cybersecurity • Enterprise

Veeam Software is a global market leader in data resilience and protection, offering self-managed backup software for hybrid and multi-cloud environments. The Veeam Data Platform provides comprehensive solutions for data backup, recovery, and security, built on Zero Trust principles and AI-powered tools for data intelligence. Veeam's offering includes secure backup and storage services for platforms such as Microsoft 365, AWS, and Google Cloud, and supports a wide range of workloads across virtual, physical, and SaaS environments. With a strong reputation for innovation and high customer trust, Veeam serves a broad range of industries and provides data resilience against disruptions such as ransomware attacks. Its solutions give organizations data freedom, secure storage, and efficient management, underpinning its position as a leading provider of enterprise backup and recovery software worldwide.

Site Reliability Engineer

🕒 2 days ago

🌪️ Kansas – Remote

💵 $109,800 - $183,000 / year

⏰ Full-time

🟡 Mid-level

🟠 Senior

⛑ DevOps and Site Reliability Engineer (SRE)

🦅 H1B visa sponsor

🗣️🇺🇸🇬🇧 English required

Azure

Cloud

Distributed Systems

Grafana

Java

JavaScript

Kubernetes

Prometheus

Terraform

TypeScript

Veeam Software

Description

• Get up to speed on VDC workloads, dependencies, and operational workflows by reading code, docs, and working with SMEs.
• Write and maintain runbooks, incident guides, and operational documentation.
• Support knowledge transfer and contribute to onboarding materials for the team.
• Participate in incident response including triage, investigation, mitigation, and postmortems.
• Help implement and maintain SLIs, SLOs, and error budgets defined by the team.
• Identify reliability issues during incidents or reviews and propose concrete improvements.
• Support high availability and fault tolerance work on Azure, including Azure Government.
• Close monitoring gaps by implementing instrumentation, alerting, and dashboards based on team standards.
• Contribute to toil reduction through automation and tooling improvements.
• Participate in on-call rotations.
• Work with IaC, CI/CD pipelines, and deployment tooling in compliance-restricted environments.
• Support testing, canary deployments, and release validation workflows.
• Implement changes to infrastructure and configuration following established patterns and review processes.
• Work with engineering, security, compliance, and operations teams to execute on reliability improvements.
• Communicate clearly about system behavior, risk, and status, both in writing and in meetings.
• Raise blockers and gaps proactively; don't wait for problems to escalate.

🎯 Requirements

• 3+ years in Software Engineering, with at least 1 year in SRE, Platform Engineering, or DevOps working on cloud-hosted services.
• Experience with cloud infrastructure on Azure or a comparable cloud provider.
• Familiarity with regulated or compliance-oriented environments such as government (FedRAMP, CMMC), financial (PCI-DSS), or healthcare (HIPAA). You understand that compliance shapes what you can and can't do operationally.
• Able to read and understand code well enough to investigate system behavior without always having someone walk you through it.
• Experience with monitoring and observability tools (e.g., Prometheus, Grafana, OpenTelemetry, ELK stack).
• Experience with IaC tools (Terraform, Terragrunt, or Pulumi) and container orchestration (Kubernetes).
• Experience with CI/CD tooling such as GitHub Actions, Azure DevOps, GitLab CI, or ArgoCD.
• Strong programming skills in one or more of: TypeScript/JS, Go, Java, C#, or similar.
• Solid understanding of distributed systems fundamentals and networking basics.
• Clear written and verbal communication skills.

🏖️ Benefits

• Unlimited paid time off, 12 paid holidays including 4 global VeeaMe Days for self-care and 24 paid volunteer hours annually through Veeam Cares
• Paid parental leave: 8 weeks for all parents, 16 weeks for birthing parents
• Medical, dental, and vision coverage starting on your first day
• Mental health support, therapy sessions, and digital wellness tools via our Employee Assistance Program
• 401(k) retirement plan with company matching contributions
• Fertility, adoption, and surrogacy support through Maven, plus paid volunteer time
• AirVet: 24/7 virtual veterinary care at no cost
• Legal services, identity protection, and supplemental health insurance options
• Tax-advantaged spending accounts for healthcare, dependent care, and commuting
• Opportunities to learn and grow through on-demand libraries (LinkedIn Learning, O’Reilly), mentoring, workshops, and learning events like our annual Global Day of Learning
