Site Reliability Engineer

🕒 7 months ago

🇺🇸 United States – Remote

💵 $115,000 - $135,000 / year

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & SRE Engineer

🗣️🇺🇸🇬🇧 English required

Aalyria

51 - 200 employees

📡 Telecommunications

🏢 Enterprise

☁️ SaaS

Aalyria is a space and communications technology company that builds, orchestrates, and manages planetary-scale networks by combining coherent wireless atmospheric laser communications (Tightbeam) with an AI-powered network orchestration software platform (Spacetime). The company enables multi-domain, multi-orbit connectivity across land, sea, air, and space, supporting satellite constellations, 5G/NTN architectures, and hybrid networks, and works with commercial and government partners to deploy hardware and software for resilient, high-capacity communications.

Description

• Help design and build Aalyria's centralized observability platform, integrating and scaling tools for metrics (e.g., Prometheus), logging (e.g., Loki), and distributed tracing (e.g., Tempo/OpenTelemetry).
• Define, implement, and manage a robust framework of Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets for our core products, ensuring we are launch-ready.
• Partner with SWEs to implement observability best practices, develop standard templates and documentation, and configure tooling (e.g., OpenTelemetry libraries).
• Automate the deployment, scaling, and management of the entire observability stack using Infrastructure as Code (e.g., Terraform) and GitOps principles (e.g., ArgoCD).
• Partner closely with the core infrastructure team to ensure deep visibility into our Kubernetes clusters and underlying GCP and AWS environments.
• Develop and lead the company's monitoring, alerting, and incident response strategy, driving a culture of proactive reliability and blameless post-mortems.

🎯 Requirements

• 4+ years of experience in an SRE or platform engineering role, with a focus on observability for large-scale, distributed compute or network systems.
• Deep, hands-on expertise building, scaling, and managing observability platforms (e.g., Prometheus, Grafana, Loki/ELK, OpenTelemetry, Tempo/Jaeger, Honeycomb).
• Proven experience using these tools to support performance analysis and debugging of complex distributed systems.
• Strong production-level experience with Google Cloud Platform (GCP) and Kubernetes.
• Experience using Infrastructure as Code (IaC) and GitOps principles (e.g., ArgoCD).
• Proficiency in a systems programming language, with a strong preference for Go and Python for debugging and writing tooling.
• Demonstrable experience defining, implementing, and managing SLOs, SLIs, and error budgets for production services in high-availability distributed systems.

🏖️ Benefits

• Innovative Environment: Work at a cutting-edge company shaping the future of aerospace communications.
• Impactful Work: Directly contribute to critical national security programs and initiatives.
• Growth Opportunities: Expand your career with opportunities for professional development and advancement.
• Inclusive Culture: Be part of a collaborative, supportive, and inclusive workplace where your contributions matter.
• Flexibility: Flexible working arrangements including hybrid remote/in-office schedules.
• Competitive salary, comprehensive benefits (401(k), dental, vision, health, life insurance), paid time off, and equity options.

Similar Jobs

🕒 7 months ago

AGENTIC

11 - 50

🤖 Artificial Intelligence

🤝 B2B

🏢 Enterprise

Senior DevOps Engineer / Cloud Architect designing multi-account architectures for the Apex program. Mastering AWS and full-stack development with a focus on cloud-native solutions.

🇺🇸 United States – Remote

⏰ Full Time

🟠 Senior

⛑ DevOps & SRE Engineer

🗣️🇺🇸🇬🇧 English required

🕒 7 months ago

Stormlight Capital

1 - 10

💸 Finance

💳 Fintech

DevOps Engineer at Stormlight Capital optimizing infrastructure for derivatives trading operations. Ensuring systems process market data and execute trades at high performance.

🇺🇸 United States – Remote

💵 $225,000 - $325,000 / year

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & SRE Engineer

🗣️🇺🇸🇬🇧 English required

🕒 7 months ago

CloudScouts

11 - 50

🤝 B2B

🏢 Enterprise

💸 Finance

AWS DevOps Engineer designing cloud-native applications for SAP S/4HANA processes. Optimizing AWS cost/performance in a fully remote work environment.

🇺🇸 United States – Remote

⏰ Full Time

🟠 Senior

🔴 Expert

⛑ DevOps & SRE Engineer

🗣️🇺🇸🇬🇧 English required

🕒 7 months ago

TaxAct

51 - 200

💸 Finance

💳 Fintech

🛍️ eCommerce

Consultant role at Taxwell helping clients with tax preparation and advocating for their needs while maintaining an inclusive atmosphere.

🗣️🇺🇸🇬🇧 English required

🕒 7 months ago

Hydra Host

11 - 50

🔧 Hardware

🏢 Enterprise

🤖 Artificial Intelligence

Site Reliability Engineer ensuring high uptime and performance for cloud systems at Hydra Host. Collaborating with teams to integrate monitoring and QA tools for reliability and observability.

🇺🇸 United States – Remote

💵 $140,000 - $200,000 / year

💰 €10,000,000 Seed Round in 2022-04

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & SRE Engineer

🗣️🇺🇸🇬🇧 English required