Senior DevOps Engineer

🕒 4 months ago

🇺🇸 United States – Remote

⏰ Full-Time

🟠 Senior

⛑ DevOps & SRE Engineer

🗣️🇺🇸🇬🇧 English required

Shuru

51 - 200 employees

Founded in 2021

🤖 Artificial Intelligence

🤝 B2B

🏢 Enterprise

Shuru is a product, AI, and technology consultancy that partners with companies to provide strategic consulting, end-to-end product development, custom software, and tailored engineering team extension. Its AI-native engineering teams build scalable AI applications, data engineering and analytics, cloud/DevOps, and API integrations to modernize systems and accelerate product delivery. Shuru operates globally with a remote-first model and emphasizes high accountability, design thinking, and measurable outcomes for its enterprise and startup clients.

Description

• Kubernetes platform engineering (EKS-first): design, build, and operate production-grade Kubernetes clusters (multi-nodegroup, autoscaling, upgrades, cluster add-ons).
• Implement intelligent autoscaling using real metrics (queue depth, consumer lag, service latency) via tools like KEDA/Karpenter.
• Own AWS environments end-to-end (VPC, IAM, EKS/ECS/EC2, ALB/ELB, S3, Route53, CloudWatch, RDS, SQS, Lambda).
• Build reproducible infrastructure using Terraform, with strong review and change-management practices.
• Implement backup/DR patterns (e.g., snapshots, retention, automation) and safe rollouts.
• Design infrastructure for data-intensive workloads: high-throughput ingestion, batch processing, and real-time streaming.
• Understand and operate distributed systems at scale: consensus, partitioning, replication, and failure modes.
• Build and maintain infrastructure for data pipelines and vector databases.
• Design for horizontal scalability, ensuring systems handle growing data volumes and user traffic gracefully.
• Build and own monitoring and logging from scratch and make it actionable (Prometheus/Grafana, ELK/EFK, alerting).
• Define or partner on SLIs/SLOs and incident response practices; improve reliability with data-driven changes.
• Establish performance testing and production-like load testing environments.
• Continuously reduce AWS spend via right-sizing, Spot strategies, reserved capacity planning, and architecture improvements.
• Partner with engineering teams to diagnose bottlenecks (database queries, caching, queueing) and propose scalable solutions.
• Optimize infrastructure costs for data-heavy workloads (storage tiering, compute scheduling, GPU utilization).
• Improve cloud and cluster security posture (IAM, network policies, secrets management, least privilege).
• Support SOC2 readiness/execution (controls, evidence automation, operational hardening).
• Implement access management patterns.

🎯 Requirements

• 7+ years in DevOps / SRE / Cloud Infra roles operating production systems.
• Deep hands-on experience with Kubernetes in production.
• Strong AWS fundamentals across compute/networking/storage/identity, including VPC, IAM, EC2/EKS, ALB, S3, Route53, CloudWatch, RDS, SQS.
• Proven ability to build infra using Terraform (and strong IaC practices).
• Production-grade observability experience: Prometheus + Grafana, and centralized logging (ELK/EFK or similar).
• Experience scaling product infrastructure: you've grown systems from thousands to millions of requests, and understand capacity planning, bottleneck identification, and scaling patterns.
• Solid understanding of distributed systems concepts: CAP theorem, consistency models, partitioning strategies, distributed consensus, and failure handling.
• Strong understanding of databases and performance fundamentals.
• CI/CD experience building reliable pipelines (Jenkins, Spinnaker, GitHub Actions, or equivalents), with safe deployment strategies.
• Scripting/automation ability in Python and/or Bash (Go is a plus).

🏖️ Benefits

• Competitive salary and benefits package.
• Opportunity to work with a team of experienced product and tech leaders.
• A flexible work environment with remote working options.
• Continuous learning and development opportunities.
• Chance to make a significant impact on diverse and innovative projects.

Similar Jobs

🕒 4 months ago

Signalmash

51 - 200

📡 Telecommunications

🔌 API

☁️ SaaS

DevOps Engineer managing production infrastructure and optimizing cloud operations for a telecom+AI company. Design, deploy, and maintain applications while ensuring security and cost efficiency.

🇺🇸 United States – Remote

⏰ Full-Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & SRE Engineer

🗣️🇺🇸🇬🇧 English required

🕒 4 months ago

Sardine

51 - 200

🔒 Cybersecurity

📋 Compliance

💳 Fintech

DevOps Engineer at Sardine evolving infrastructure and platform tooling for fraud prevention. Collaborating cross-functionally to enhance reliability, scalability, and cost efficiency.

🇺🇸 United States – Remote

💵 $160,000 - $200,000 / year

⏰ Full-Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & SRE Engineer

🗣️🇺🇸🇬🇧 English required

🕒 4 months ago

Dealer Tire

1001 - 5000

🛒 Retail

🤝 B2B

Integration Platform Reliability Engineer developing and supporting EDI/Integration platforms for Dealer Tire. Collaborating with various teams to ensure operational stability and system documentation.

🇺🇸 United States – Remote

💵 $81,000 - $90,000 / year

💰 €157,900,000 Private Equity Round in 2009-09

⏰ Full-Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & SRE Engineer

🦅 H1B Visa Sponsor

🗣️🇺🇸🇬🇧 English required

🕒 4 months ago

OpenAI

201 - 500

🤖 Artificial Intelligence

☁️ SaaS

🏢 Enterprise

AI Deployment Engineer implementing and scaling AI coding tools for OpenAI's clients. Collaborating with engineering teams to enhance productivity through innovative AI solutions.

🇺🇸 United States – Remote

💵 $176,000 - $224,000 / year

⏰ Full-Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & SRE Engineer

🦅 H1B Visa Sponsor

🗣️🇺🇸🇬🇧 English required

🕒 5 months ago

MetaMask

51 - 200

₿ Crypto

🌐 Web 3

💳 Fintech

DevSecOps Engineer at Consensys working on MetaMask and Infura platforms involving code deployment and infrastructure management. Collaborating with cross-functional teams to enhance cybersecurity and drive development.

🇺🇸 United States – Remote

💵 $160,000 - $218,000 / year

⏰ Full-Time

🟠 Senior

⛑ DevOps & SRE Engineer

🗣️🇺🇸🇬🇧 English required