Senior Engineer, Site Reliability




Syniti

1001 - 5000 employees

🤝 B2B

🏢 Enterprise

💰 Private Equity Round on 2017-08

B2B • Data Management • Enterprise

Syniti is a global team of over 1,300 professionals dedicated to helping enterprises transform complex data challenges into competitive advantages. With more than 25 years of experience in enterprise data management, Syniti partners with Fortune 500 companies and operates from 30 offices across 18 countries. The company is committed to investing in the success of its clients through innovative tools and methodologies, positioning itself as a trusted problem solver in the data management space.

📋 Description

• Design and build automated cloud infrastructure for Syniti-hosted SaaS workloads across Azure and AWS.
• Implement and manage CI/CD pipelines using GitHub Actions and ArgoCD GitOps workflows for multi-environment deployments (dev, preprod, prod, GovCloud).
• Develop and maintain Terraform modules for infrastructure provisioning, including EKS/AKS clusters, Aurora PostgreSQL, Redis, OpenSearch, and S3/Azure Storage.
• Integrate and maintain observability frameworks (Prometheus, Grafana, Loki, Mimir, Jaeger) and enforce structured logging of application, audit, and security events.
• Support and extend Istio service mesh configurations including mTLS policies, AuthorizationPolicies, and SPIFFE-based workload identity.
• Implement and maintain supply chain security controls: container image signing (Cosign), SBOM generation (Syft), provenance attestation (SLSA), and vulnerability scanning (Trivy, Inspector, Snyk).
• Collaborate with security teams to meet FedRAMP High, Cyber Essentials+, NIST 800-53, and SecNumCloud control objectives across Azure and AWS environments.
• Provide L3 incident response and lead root cause analysis efforts for application-tier outages across the Northstar platform.
• Support the automated compliance gate (11 controls: NAC, FIPS, STIG, IRSA, SAST, DAST, SBOM, Sign, Vuln, Audit, Auth) and ensure application services pass all gates.
• Manage Kubernetes workload autoscaling (Karpenter, KEDA, HPA) and pod security policies (Kyverno) for production services.

🎯 Requirements

• 10+ years of experience in SRE, DevOps, or Cloud Engineering.
• 5+ years of hands-on experience with Microsoft Azure including AKS, Storage, Monitor, and IAM/Entra ID.
• 3+ years of hands-on experience with AWS (EKS, RDS/Aurora, IAM, S3, SQS, Cognito).
• 3+ years of Terraform module development for cloud infrastructure provisioning.
• 3+ years in CI/CD tooling (GitHub Actions, ArgoCD, or equivalent GitOps platforms).
• Strong Kubernetes operations experience: cluster management, pod security, autoscaling, troubleshooting.
• Experience with service mesh technologies (Istio, Envoy, or Linkerd) and workload identity (SPIFFE/SPIRE, IRSA, or workload identity federation).
• Proficiency in scripting languages: Python, Bash, or PowerShell. Go or .NET experience a plus.
• Experience implementing regional compliance controls (FedRAMP, SOC 2, Cyber Essentials+, or equivalent).
• Understanding of Zero Trust principles, mTLS, service identity, and network segmentation.
• Experience with observability stacks (Prometheus, Grafana, Loki, or equivalent) and distributed tracing.
• Familiarity with container supply chain security (image signing, SBOM, vulnerability scanning) is a plus.

🏖️ Benefits

• Trust in your talent.
• Growth opportunities.
• Supportive environment.
• Recognition of individual achievements.
• Commitment to inclusion and diversity.

