Site Reliability Engineer

April 4


mywork

Passionate about people and technology, aligning talented professionals with great opportunities and clients.

1 - 10 employees

Founded 2019

📋 Description

• Collaborate with a globally distributed SRE team to ensure the reliability and scalability of production environments.
• Maintain the infrastructure and services that support cloud-based applications, ensuring uptime and optimal performance.
• Oversee Kubernetes clusters and ensure their health and reliability.
• Automate deployment and configuration processes using tools like Terraform and Ansible.
• Develop and implement monitoring systems with tools such as CloudWatch, Grafana, and Prometheus to identify potential issues before they impact users.
• Identify, troubleshoot, and resolve defects in production systems, making necessary code changes to improve stability and performance.
• Participate in an on-call rotation to provide 24/7 support for critical production systems.

🎯 Requirements

• At least 5 years of experience as a Site Reliability Engineer or in a similar role.
• Strong expertise in AWS services, including EKS, S3, RDS, Lambda, and EC2.
• Proficiency in Kubernetes management and troubleshooting.
• Knowledge of, or ability to code with, Java, Python, Go, or Kafka.
• Hands-on experience with automation tools such as Terraform and Ansible.
• Familiarity with monitoring tools like CloudWatch, ELK Stack, Grafana, and Prometheus.
• Comprehensive knowledge of networking and security principles, including firewalls, VPNs, and SSL/TLS.
• Excellent troubleshooting and problem-solving abilities.
• Strong written and verbal communication skills.

🏖️ Benefits

• Competitive salary between £55,000 and £75,000.
• Company pension.
• Generous annual leave allowance.
• Opportunity to work in a fully remote, collaborative global team.
• Comprehensive health and wellness benefits.
• Ongoing professional development and training opportunities.

Similar Jobs

February 12

Oversee SRE operations for a prominent B2B diamond marketplace, ensuring reliability and scalability.

AWS, Azure, Cloud, Google Cloud Platform, Grafana, Prometheus, Python, Terraform, Go

February 12

Responsible for cloud development and operations. Must be SC cleared UK national with relevant experience.

Ansible, AWS, Chef, Cloud, DNS, Docker, EC2, Grafana, Jenkins, Kubernetes, Prometheus, Puppet, TCP/IP, Terraform

February 11

Join Prima as an SRE, ensuring reliability and performance while supporting software teams in cloud operations.

AWS, Cloud, DNS, Elixir, Kafka, Kubernetes, Microservices, Postgres, Python, RabbitMQ, Redis, Rust, Terraform

February 8

Join a telecoms software company as a Site Reliability Engineer ensuring system performance and reliability.

Cloud, Grafana, Kubernetes, Prometheus, Go