Senior Site Reliability Engineer

ARA

1001 - 5000 employees

🚀 Aerospace

🤖 Artificial Intelligence

🔬 Science

💰 $12M Grant on 2023-04

ARA is a 100% employee-owned applied research and engineering company that provides technically rigorous solutions across national security, infrastructure, health, and energy domains. The firm specializes in C4ISR and space technologies, unmanned systems and autonomy, sensors and advanced security systems, AR/VR and synthetic environments, AI/ML, electromagnetics and explosive testing, biodefense and physiological modeling, and disaster risk and infrastructure engineering. ARA delivers research, engineering, prototyping, and mission-focused technical services to government and commercial customers.

📋 Description

• Partner with software developers, platform engineers, and IT staff to improve system design, operability, deployment safety, and production support readiness.
• Define and maintain operational standards, runbooks, support procedures, escalation paths, and service-level objectives.
• Evaluate system architecture and changes to ensure they balance functional requirements, service quality, reliability, security, and compliance needs.
• Drive continuous improvement in platform stability, maintenance, and availability.
• Provide advanced technical support and troubleshooting for complex platform and service issues affecting internal users and stakeholders.

🎯 Requirements

• 8+ years of experience in Site Reliability Engineering, DevOps, Platform Engineering, Systems Engineering, or related infrastructure roles supporting production services.
• Strong experience with Linux systems administration and troubleshooting in enterprise environments.
• Strong experience operating and maintaining on-prem Kubernetes platforms and all related components, including CRI, CNI, and CSI plugins.
• Experience deploying and maintaining applications on Kubernetes using Helm, Kustomize, and similar tooling.
• Experience supporting DevOps tooling such as GitLab, Artifactory, Jira, and Confluence.
• Experience with GitOps tools such as FluxCD or ArgoCD.
• Proficiency scripting with at least one of Python, Go, or Bash.
• Strong experience designing, maintaining, and maturing observability tooling, including monitoring, dashboards, logging, and tracing, and supporting SLOs.
• Strong understanding of reliability engineering concepts: service health indicators; high availability design, failure reduction, and testing; operational readiness practices, including developing documentation, runbooks, and architectural descriptions; and incident response, root cause analysis, and remediation/recovery.
• Ability to obtain a security clearance, which includes U.S. citizenship.

🏖️ Benefits

• Equal Opportunity Employer/Protected Veterans/Individuals with Disabilities
