Reliability Engineer

🔥 4 minutes ago

General Dynamics Information Technology

10,000+ employees

Founded 1954

🔒 Cybersecurity

🤖 Artificial Intelligence

Defense • Cybersecurity • Artificial Intelligence

General Dynamics Information Technology (GDIT) provides consulting, digital modernization, and application services, with work spanning artificial intelligence, cloud computing, cybersecurity, high-performance computing, and quantum technologies. GDIT supports the government and defense sectors with mission-critical services such as logistics and supply chain management, intelligence, and homeland security. The company also emphasizes diverse and inclusive hiring practices and promotes employee well-being. Through its digital accelerator solutions and use of emerging technologies, GDIT aims to advance agencies' missions and address complex technological challenges.

📋 Description

• Design, build, and maintain scalable, reliable infrastructure and services that support Hosting Services, Site Reliability Engineering, virtualization, and data center operations for a federal customer.
• Collaborate closely with software developers, infrastructure engineers, and IT operations teams to plan and execute deployments, improve system architectures, and enhance service reliability.
• Use automation and scripting (e.g., Python, Bash) to reduce manual work, streamline deployments, and improve consistency across environments.
• Monitor system performance, availability, and capacity using modern tooling; proactively identify issues and participate in on-call support to restore services quickly when incidents occur.
• Implement and support Continuous Integration/Continuous Delivery (CI/CD) pipelines using tools such as Jenkins, Git, and Terraform to enable reliable and repeatable releases.
• Leverage containerization and orchestration technologies such as Docker and Kubernetes to build resilient, scalable platforms.
• Work with databases (e.g., SQL, MySQL) and application stacks (e.g., Java-based services) to ensure data integrity, performance, and fault tolerance.
• Partner with cross-functional teams, using Jira and other collaboration tools, to track work, communicate status, and drive continuous improvement in reliability and operational excellence.
• Contribute to a culture of teamwork and collaboration by sharing knowledge, participating in post-incident reviews, and helping define best practices for reliability engineering.

🎯 Requirements

• 5+ years of related experience in Site Reliability Engineering, DevOps, systems engineering, or software engineering roles
• Experience with deployments and production operations in Linux-based environments
• Proficiency with scripting/coding (e.g., Python, Java, shell scripting)
• Hands-on experience with AWS or other cloud platforms
• Strong Linux administration skills
• Experience with SQL/MySQL and database concepts
• Containerization and orchestration (Docker, Kubernetes)
• CI/CD and automation tools (Jenkins, Git, Terraform, Ansible)
• Experience with Infrastructure as Code (IaC) and automated configuration management
• Must have a BA/BS or equivalent

🏖️ Benefits

• Comprehensive benefits and wellness packages
• 401K with company match
• Paid time off
• Full flex work weeks where possible
• Variety of paid time off plans including vacation, sick and personal time, holidays, paid parental, military, bereavement and jury duty leave
• 15 days of paid leave per calendar year
• Additional 10 paid holidays per year
• Short and long-term disability benefits
• Life, accidental death and dismemberment, personal accident, critical illness and business travel and accident insurance
