Senior Site Reliability Engineer

51 - 200 employees

Founded 2016

☁️ SaaS

🏢 Enterprise

🤖 Artificial Intelligence

SaaS • Enterprise • Artificial Intelligence

ClickHouse is a fast, resource-efficient real-time data warehouse and open-source database built for outstanding query performance in business-critical and time-critical applications. It is available as a cloud service on leading platforms such as AWS, GCP, and Azure, offers a "Bring Your Own Cloud" option, and provides a broad range of integrations for seamless operation across different tech stacks. ClickHouse excels at real-time analytics, machine learning, business intelligence, and observability, which makes it an ideal choice for use cases such as financial services, fraud detection, and gaming analytics. It supports developer-friendly SQL operations, offers cost-efficient storage, and is an open-source alternative to traditional databases. Companies such as Sony, Lyft, Cisco, GitLab, and Twilio use ClickHouse for its scalability, efficiency, and ease of use.

Senior Site Reliability Engineer

🕒 3 months ago

🇺🇸 United States – Remote

💵 $141,000 - $208,000 / year

⏰ Full-time

🟠 Senior

⛑ DevOps and Site Reliability Engineer (SRE)

🦅 H1B visa sponsor

🗣️🇺🇸🇬🇧 English required

Ansible

AWS

Azure

Cloud

Docker

Google Cloud Platform

Kubernetes

Puppet

Python

SQL

Terraform

ClickHouse

Description

• Collaborate with various engineering teams in ClickHouse to design and implement scalable, secure, and highly available systems for ClickHouse.
• Establish and manage service level objectives (SLOs) and service level agreements (SLAs) for ClickHouse Cloud.
• Ensure all the infrastructure components in ClickHouse Cloud (including Dataplane, Control Plane and ClickHouse Core) have monitoring and alerting in place to ensure timely detection and resolution of incidents.
• Enhance and refine incident response processes and post-mortem analysis for any outages in ClickHouse Cloud including working with the support team to communicate to the impacted customers.
• Continuously improve the reliability and performance of our ClickHouse services.
• Plan, enable, and drive Chaos initiatives across Engineering teams, based upon internal priorities.
• Manage on-call processes to respond to performance and reliability issues, and establish best practices for coordinating escalation to resolve issues and minimize downtime.

🎯 Requirements

• Bachelor’s or Master’s degree in Computer Science or a related field.
• At least 8 years of experience in Site Reliability Engineering or a related field.
• Previous experience using ClickHouse in production.
• Hands-on experience with Go and/or Python.
• Strong knowledge of cloud computing platforms such as AWS, Azure, or Google Cloud Platform.
• Excellent understanding of distributed databases and SQL; experience with ClickHouse in particular is a major plus.
• Hands-on experience with container orchestration tools such as Kubernetes or Docker Swarm.
• Strong experience with automation and configuration management tools such as Ansible, Terraform, or Puppet.
• You are a strong problem solver and have solid production debugging skills.
• You are passionate about efficiency, availability, scalability, and data governance.
• You thrive in a fast-paced environment and see yourself as a partner to the business, with the shared goal of moving the business forward.
• You have a high level of responsibility, ownership, and accountability.
• Excellent communication and interpersonal skills.

🏖️ Benefits

• Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries.
• Healthcare - Employer contributions towards your healthcare.
• Equity in the company - Every new team member who joins our company receives stock options.
• Time off - Flexible time off in the US, generous entitlement in other countries.
• Home office - A $500 home office setup if you’re a remote employee.
• Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites.

Similar jobs

Site Reliability Engineer – AI & ML Infrastructure, Kubernetes, AWS, Terraform

🕒 3 months ago

Deepgram

51 - 200

🤖 Artificial Intelligence

☁️ SaaS

🔌 API

Site Reliability Engineer managing AI/ML infrastructure for Deepgram. Architecting, building, and optimizing hybrid systems with Kubernetes, AWS, and Terraform.

🇺🇸 United States – Remote

💵 $150,000 - $220,000 / year

💰 €47,000,000 Series B in 2022-11

⏰ Full-time

🟡 Mid-level

🟠 Senior

⛑ DevOps and Site Reliability Engineer (SRE)

🦅 H1B visa sponsor

🗣️🇺🇸🇬🇧 English required

AWS

Kubernetes

Python

Terraform

DevOps Engineer – Systems Focus

🕒 3 months ago

Elligint Health

51 - 200

⚕️ Health Insurance

🧬 Biotechnology

DevOps Engineer optimizing Windows-based web services in AWS for a healthcare organization. Collaborating on file processing and ensuring compliance with healthcare regulations.

🇺🇸 United States – Remote

⏰ Full-time

🟡 Mid-level

🟠 Senior

⛑ DevOps and Site Reliability Engineer (SRE)

🗣️🇺🇸🇬🇧 English required

AWS

EC2

Python

SQL

TCP/IP

Expert DevOps, DevSecOps, GenAI

🕒 3 months ago

Inetum

10,000+ employees

🤝 B2B

🏢 Enterprise

☁️ SaaS

Expert DevOps / DevSecOps supporting Generative AI initiatives at Inetum for digital transformation in the United States. Designing high-value GenAI use cases and integrating new tools and practices.

🇺🇸 United States – Remote

💰 Post-IPO Equity in 2007-03

⏰ Full-time

🟠 Senior

🔴 Expert

⛑ DevOps and Site Reliability Engineer (SRE)

🗣️🇫🇷 French required

Cloud

Open Source

Site Reliability Engineering Manager

🕒 3 months ago

Flywire

1001 - 5000

💸 Finance

💳 Fintech

Manager II of Site Reliability Engineering at Flywire driving reliability, automation, and performance in cloud infrastructure. Collaborating with Engineering teams to achieve production excellence in a global environment.

🇺🇸 United States – Remote

💵 $160,000 - $200,000 / year

💰 €60,000,000 Series F in 2021-03

⏰ Full-time

🟡 Mid-level

🟠 Senior

⛑ DevOps and Site Reliability Engineer (SRE)

🦅 H1B visa sponsor

🗣️🇺🇸🇬🇧 English required

Cloud

DevSecOps Engineer, Cloud Operations

🕒 3 months ago

NOVA Corporation

1 - 10

🤝 B2B

☁️ SaaS

DevSecOps & Cloud Operations Engineer at North Stone supporting cloud automation, monitoring, and security. Managing CI/CD pipelines and optimizing system performance across cloud platforms.

🇺🇸 United States – Remote

⏰ Full-time

🟡 Mid-level

🟠 Senior

⛑ DevOps and Site Reliability Engineer (SRE)

🗣️🇺🇸🇬🇧 English required

AWS

Azure

Cloud