Senior Site Reliability Engineer

🕒 April 30

Civica US

51 - 200 employees

Founded 2023

🏛️ Government

☁️ SaaS

📚 Education

Civica US is a global leader in public sector software, providing solutions that help deliver critical services for citizens around the world. The company's software is used by over 5,000 public bodies globally, supporting more than 100 million citizens. Civica's products are designed to serve government departments, justice and court systems, education institutions, and health and care providers. Its capabilities include cloud and digital services, data analytics, financial management, people management, and governance, risk, and compliance. As a GovTech company, Civica US specializes in software that improves outcomes and efficiency across public sector domains, supporting administrative processes, data management, and citizen engagement.

📋 Description

• Designing and implementing for scale & resilience: Architect, implement and continuously improve our existing Data Center and Cloud environments on AWS, Azure, and VMware, ensuring they meet our SLAs and adapt dynamically to demand, working alongside the Platform teams providing PaaS/IaaS.
• Driving automation: Build and evolve infrastructure as code (Terraform, etc.) and CI/CD pipelines (GitHub Actions, etc.) to ship new features safely and at speed.
• Defining and measuring reliability: Partner with teams to set up meaningful SLIs/SLOs, implement real-time observability (Datadog, Prometheus, Grafana, ...) and proactively identify risks before they impact our users.
• Leading incident response: Own the on-call rota, coach teams through blameless post-mortems, and embed a culture of continuous improvement so outages become learning opportunities.
• Mentoring & evangelism: Share your deep expertise by pairing with engineers, running brown-bag sessions on reliability best practices, and helping raise the bar across our global engineering organisation.
• Securing our stack: Collaborate with our Security team to build security controls into CI/CD, runtime environments and disaster-recovery plans, so that our customers and citizens are always protected.

🎯 Requirements

• Demonstrable experience in a production SRE, DevOps or infrastructure role, ideally within a SaaS or large-scale web environment.
• Expert in at least one public cloud (AWS, Azure, or GCP) and comfortable designing hybrid migrations from on-prem to cloud.
• Strong coding/scripting and troubleshooting skills (in any of Go, .NET, Java, Python, etc.) and a passion for building reusable, tested libraries and tooling.
• Proven track record with IaC tools (Terraform, CloudFormation, or similar) and container orchestration (Kubernetes, ECS, AKS, OpenShift).
• Proven track record with virtual machine orchestration/provisioning and resiliency strategies (KubeVirt, Packer, Ansible).
• Deep understanding of monitoring, logging, and tracing frameworks (Prometheus/Grafana, ELK/OpenSearch, Jaeger, etc.).
• Excellent communicator who thrives in cross-functional teams, with a passion for translating complex technical issues into clear, actionable plans.

🏖️ Benefits

• 25 Days Annual Leave + bank holidays – plus the option to buy up to 10 extra days!
• Days of Difference – Up to 3 extra days off for volunteering.
• Pension Contributions – 5% employer match to support your future.
• Income Protection – Up to 75% salary cover for long-term illness.
• Life Assurance – 4x salary tax-free lump sum.
• Critical Illness Cover – £25,000 lump sum (extendable to dependents).
• Private Medical Insurance – Fast access to private healthcare.
• Health Cash Plan – Claim back physio, therapies & more.
• Dental Insurance – Cover for routine & emergency care.
• Electric Vehicle (EV) Scheme – A wide range of electric & hybrid vehicles.
• Affinity Groups – Join employee-led communities.
• Bounty Bonus – Refer a friend & get rewarded.
