Staff Site Reliability Engineer – Project Volcano

🕒 4 days ago

🇺🇸 United States – Remote

💵 $140,000 - $197,000 / year

⏰ Full Time

🔴 Expert

⛑ DevOps & SRE Engineer

🗣️🇺🇸🇬🇧 English required

Kong Inc.

201 - 500 employees

Founded in 2017

🔌 API

☁️ SaaS

🏢 Enterprise

💰 €100,000,000 Series D, 2021-02

Kong Inc. provides a unified API and AI platform that lets organizations build, manage, discover, govern, and monetize APIs, LLMs, event streams, and microservices. The company offers Kong Konnect, a single platform that includes an API gateway, an AI gateway, an event gateway, a service mesh, developer tools such as Insomnia, and security and governance features. It is designed to help enterprises modernize AI adoption, secure agentic workflows, and accelerate the delivery of API-driven products. Kong targets large organizations and developer teams with cloud, on-premises, and hybrid deployment options, professional services, and a strong open source community.

Description

• Own reliability for Volcano end-to-end: Define and drive SLOs, error budgets, and incident response practices for all Volcano services (edge deployments, managed Postgres, auth, realtime, storage, and the control plane).
• Architect the platform's infrastructure: Design and build the multi-region Kubernetes infrastructure, networking, and data plane that powers Volcano's edge deployment pipeline and backend-as-a-service capabilities.
• Build the GitOps and CI/CD backbone: Establish deployment automation, canary pipelines, and preview environment provisioning using ArgoCD, Helm, and Terraform/Terragrunt, setting patterns the broader team will follow.
• Scale managed data services: Design, operate, and harden multi-tenant PostgreSQL clusters, Redis caching layers, and object storage, with a focus on data isolation, performance, and disaster recovery.
• Drive observability from day one: Instrument every Volcano service with meaningful SLIs; build dashboards, alerts, and runbooks using Datadog, Prometheus, and Grafana before services go live, not after incidents.
• Lead cross-functional reliability work: Collaborate with the OCTO team, product engineering, and security to bake reliability and compliance into Volcano's architecture, not bolt it on later.
• Set SRE culture and standards: Mentor engineers across Volcano's contributing teams on reliability principles; lead postmortems, define on-call practices, and build a blameless engineering culture.
• Evaluate and adopt emerging technologies: Given Volcano's greenfield nature, evaluate and make architectural decisions on edge runtimes, serverless compute, vector databases, and AI-native infrastructure components.

🎯 Requirements

• BS in Computer Science or equivalent; substantial experience at Staff or Principal IC level in SRE/Platform Engineering.
• Proven track record building SRE or platform engineering practices for developer-facing platforms or PaaS/SaaS products, ideally at greenfield stage.
• Deep Kubernetes expertise: multi-tenant cluster design, networking (CNI, service mesh, ingress), autoscaling, and security hardening.

🏖️ Benefits

• Access to healthcare benefits
• A 401(k) plan
• Short- and long-term disability benefits
• Basic life and AD&D insurance
