Senior Site Reliability Engineer

September 9

CodingChiefs: Dedicated Remote Developers

B2B • Recruitment • SaaS

CodingChiefs provides dedicated remote developers for a range of projects and businesses. The company specializes in skilled developers who integrate with existing teams to deliver high-quality software. It matches each developer to the client's specific needs and technology requirements, which supports efficient delivery and effective communication.

📋 Description

• Design, build, and maintain core infrastructure using Infrastructure as Code (IaC) principles
• Evolve CI/CD pipelines to ensure safe, rapid, and reliable releases
• Identify and address performance bottlenecks, single points of failure, and scalability limits
• Define and monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs)
• Implement and manage monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, ELK Stack)
• Participate in on-call rotation; lead incident response and blameless post-mortems
• Collaborate with software engineering teams to improve resilience, observability, and developer experience
• Implement and maintain security best practices across cloud infrastructure

🎯 Requirements

• 5+ years of hands-on experience with a major cloud provider, preferably AWS (EC2, S3, RDS, VPC, IAM, etc.)
• Deep proficiency with tools like Terraform or CloudFormation to manage infrastructure declaratively
• Strong experience with Docker and container orchestration systems like Kubernetes (EKS) or ECS
• Proven ability to build, optimize, and manage CI/CD pipelines using tools like GitLab CI, Jenkins, or CircleCI
• Hands-on experience with modern monitoring and logging tools (e.g., Prometheus, Grafana, Loki, Alertmanager, ELK Stack)
• Proficiency in at least one programming language, such as Go, Python, or Bash, for automation and tooling
• Excellent written and verbal communication skills, with a proven ability to work effectively and asynchronously in a distributed team environment
• Preferred: Experience in the payments or FinTech industry
• Preferred: Familiarity with service mesh technologies like Istio or Linkerd
• Preferred: Experience with database administration (e.g., PostgreSQL, MySQL)
• Preferred: Knowledge of networking, security principles, and compliance standards (e.g., PCI DSS)

