Senior Site Reliability Engineer

51 - 200 employees

Founded in 2016

☁️ SaaS

🏢 Enterprise

🤖 Artificial Intelligence

SaaS • Enterprise • Artificial Intelligence

ClickHouse is a fast, resource-efficient, open-source real-time data warehouse and database, designed to deliver superior query performance for critical, time-sensitive applications. It is available as a cloud service on the major platforms (AWS, GCP, Azure), with a "Bring Your Own Cloud" option and a wide range of integrations for smooth operation within varied technology stacks. ClickHouse excels at real-time analytics, machine learning, business intelligence, and observability, making it an ideal choice for use cases such as financial services, fraud detection, and video game analytics. It supports developer-friendly SQL operations, offers cost-effective storage solutions, and serves as an open-source alternative to traditional databases. Companies such as Sony, Lyft, Cisco, GitLab, and Twilio rely on ClickHouse for its scalability, efficiency, and ease of use.

🕒 3 months ago

🇺🇸 United States – Remote

💵 $141,000 - $208,000 / year

⏰ Full Time

🟠 Senior

⛑ DevOps & SRE Engineer

🦅 H1B Visa Sponsor

🗣️🇺🇸🇬🇧 English required

Ansible

AWS

Azure

Cloud

Docker

Google Cloud Platform

Kubernetes

Puppet

Python

SQL

Terraform

Description

• Collaborate with various engineering teams in ClickHouse to design and implement scalable, secure, and highly available systems for ClickHouse.
• Establish and manage service level objectives (SLOs) and service level agreements (SLAs) for ClickHouse Cloud.
• Ensure all the infrastructure components in ClickHouse Cloud (including Dataplane, Control Plane and ClickHouse Core) have monitoring and alerting in place to ensure timely detection and resolution of incidents.
• Enhance and refine incident response processes and post-mortem analysis for any outages in ClickHouse Cloud including working with the support team to communicate to the impacted customers.
• Continuously improve the reliability and performance of our ClickHouse services.
• Plan, enable, and drive Chaos initiatives across Engineering teams, based upon internal priorities.
• Manage on-call processes to respond to performance and reliability issues, and establish best practices for coordinating escalation to resolve issues and minimize downtime.

🎯 Requirements

• Bachelor’s or Master’s degree in Computer Science or a related field.
• At least 8 years of experience in Site Reliability Engineering or a related field.
• Previous experience using ClickHouse in production.
• Hands-on experience with Go and/or Python.
• Strong knowledge of cloud computing platforms such as AWS, Azure, or Google Cloud Platform.
• Excellent understanding of distributed databases and SQL; experience with ClickHouse in particular is a major plus.
• Hands-on experience with container orchestration tools such as Kubernetes or Docker Swarm.
• Strong experience with automation and configuration management tools such as Ansible, Terraform, or Puppet.
• You are a strong problem solver and have solid production debugging skills.
• You are passionate about efficiency, availability, scalability, and data governance.
• You thrive in a fast-paced environment and see yourself as a partner to the business, with the shared goal of moving the business forward.
• You have a high level of responsibility, ownership, and accountability.
• Excellent communication and interpersonal skills.

🏖️ Benefits

• Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries.
• Healthcare - Employer contributions towards your healthcare.
• Equity in the company - Every new team member who joins our company receives stock options.
• Time off - Flexible time off in the US, generous entitlement in other countries.
• A $500 Home office setup if you’re a remote employee.
• Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites.

Similar Jobs

Site Reliability Engineer – AI & ML Infrastructure, Kubernetes, AWS, Terraform

🕒 3 months ago

Deepgram

51 - 200

🤖 Artificial Intelligence

☁️ SaaS

🔌 API

Site Reliability Engineer managing AI/ML infrastructure for Deepgram. Architecting, building, and optimizing hybrid systems with Kubernetes, AWS, and Terraform.

🇺🇸 United States – Remote

💵 $150,000 - $220,000 / year

💰 €47,000,000 Series B in 2022-11

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & SRE Engineer

🦅 H1B Visa Sponsor

🗣️🇺🇸🇬🇧 English required

AWS

Kubernetes

Python

Terraform

DevOps Engineer – Systems Focus

🕒 3 months ago

Elligint Health

51 - 200

⚕️ Health Insurance

🧬 Biotechnology

DevOps Engineer optimizing Windows-based web services in AWS for healthcare organization. Collaborating on file processing and ensuring compliance with healthcare regulations.

🇺🇸 United States – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & SRE Engineer

🗣️🇺🇸🇬🇧 English required

AWS

EC2

Python

SQL

TCP/IP

DevOps / DevSecOps & GenAI Expert (M/F)

🕒 3 months ago

Inetum

10,000+ employees

🤝 B2B

🏢 Enterprise

☁️ SaaS

Expert DevOps / DevSecOps supporting Generative AI initiatives at Inetum for digital transformation in the United States. Designing high-value GenAI use cases and integrating new tools and practices.

🇺🇸 United States – Remote

💰 Post-IPO Equity in 2007-03

⏰ Full Time

🟠 Senior

🔴 Expert

⛑ DevOps & SRE Engineer

Cloud

Open Source

Site Reliability Engineering Manager

🕒 3 months ago

Flywire

1001 - 5000

💸 Finance

💳 Fintech

Manager II of Site Reliability Engineering at Flywire driving reliability, automation, and performance in cloud infrastructure. Collaborating with Engineering teams to achieve production excellence in a global environment.

🇺🇸 United States – Remote

💵 $160,000 - $200,000 / year

💰 €60,000,000 Series F in 2021-03

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & SRE Engineer

🦅 H1B Visa Sponsor

🗣️🇺🇸🇬🇧 English required

Cloud

DevSecOps Engineer, Cloud Operations

🕒 3 months ago

NOVA Corporation

1 - 10

🤝 B2B

☁️ SaaS

DevSecOps & Cloud Operations Engineer at North Stone supporting cloud automation, monitoring, and security. Managing CI/CD pipelines and optimizing system performance across cloud platforms.

🇺🇸 United States – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & SRE Engineer

🗣️🇺🇸🇬🇧 English required

AWS

Azure

Cloud