Software Architect, Reliability Engineering

🕒 3 months ago

🗣️🇺🇸🇬🇧 English required

Twilio

5001 - 10000 employees

Millions of developers around the world have used Twilio to unlock the magic of communications and improve any human experience. Twilio has democratized communications channels such as voice, text, chat, video, and email by virtualizing the world's communications infrastructure through APIs that are simple enough for any developer to use, yet robust enough to power the world's most demanding applications. By making communications part of every software developer's toolkit, Twilio enables innovators across every industry, from emerging leaders to the world's largest organizations, to reinvent how companies engage with their customers. Founded in 2008, Twilio has more than 5,000 employees in 26 offices across 17 countries and continues to grow, with headquarters in San Francisco and other offices in Atlanta, Bangalore, Berlin, Bogotá, Denver, Dublin, Paris, Prague, Hong Kong, Irvine, London, Madrid, Munich, Malmö, Mountain View, Redwood City, New York, São Paulo, Sydney, Melbourne, Singapore, Tallinn, and Tokyo.

Description

• Partner with senior technical leaders across Twilio to set and communicate the reliability strategy, translating business goals into measurable outcomes.
• Influence company-wide architectural decisions while balancing long-term vision with near-term and compliance needs.
• Lead the design, implementation, and operation of scalable solutions and paved roads that enable reliable, high-traffic services.
• Influence company-wide architectural decisions to focus on availability, performance, resilience, and cost efficiency using Kubernetes, AWS, Terraform, and modern observability.
• Ensure integrity and quality across the service lifecycle; design fault-tolerant architectures, incident response, disaster recovery, and capacity/cost management.
• Collaborate with product and cross-functional teams to identify reliability risks and convert them into actionable designs, programs, and tooling.
• Establish and champion reliability practices and drive systemic improvements.
• Mentor and grow engineers and technical leaders.
• Track and apply emerging SRE, cloud, and large-scale systems best practices; introduce pragmatic innovations that improve reliability at scale.

🎯 Requirements

• 15+ years of experience in Reliability Engineering, Software Engineering, or DevOps roles with a focus on infrastructure, backend systems, and reliability, including as a principal/architect.
• Strong experience in driving strategic technical decisions and defining long-term technical vision.
• In-depth understanding of the role of Reliability Engineering in a large and diverse SaaS organization.
• Experience driving cross-org technical architecture outcomes.
• Knowledge of cloud architecture, DevOps practices, and large-scale systems design with microservices.
• Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent experience).
• Strong production experience, including operational management, scaling, partitioning strategies, and tuning for performance and reliability in high-scale environments.
• Hands-on experience with Kubernetes (e.g., EKS), deploying and managing stateful services, and cloud services like AWS.
• Proficiency in infrastructure-as-code tools such as Terraform or CloudFormation for automating infrastructure.
• Expertise in observability tools (e.g., Prometheus, Grafana, Datadog) for monitoring distributed systems and setting up alerting.
• Proficiency in at least one programming language (e.g., Go, Python, Java) for building automation and tooling.
• Experience designing incident response processes, SLOs/SLIs, and runbooks, and participating in on-call rotations.
• Experience running cross-functional post-incident reviews and driving improvements.
• Strong understanding of distributed systems principles, including consensus, durability, throughput, and availability tradeoffs.
• Proven track record of leading reliability improvements in data-intensive or mission-critical systems and collaborating with engineering teams.
• Excellent problem-solving, analytical, verbal, and written communication skills, with the ability to work in cross-functional and distributed environments.
• Demonstrated leadership in mentoring teams, influencing decisions, and balancing long-term objectives with short-term needs.
• Ability to influence and build effective working relationships with all levels of the organization.

🏖️ Benefits

• Health care insurance
• 401(k) retirement account
• Paid sick time
• Paid personal time off
• Paid parental leave
