Senior Site Reliability Engineer

55 minutes ago


MariaDB

Enterprise • Open Source • Database

MariaDB is a company that develops and provides an open-source cloud-native relational database solution. Known for its MariaDB Server and MariaDB Enterprise offerings, it delivers high availability, auto-failover capabilities, and supports both transactional and analytical workloads. MariaDB is favored for its flexibility, cost-effectiveness compared to proprietary databases, and support for various data models, including relational and JSON. It is widely used in Linux distributions as a replacement for MySQL and is popular among developers for its open-source innovation and ease of use.

201 - 500 employees

Founded 2009

🏢 Enterprise

📋 Description

• Design, implement, and evolve large-scale, cloud-native infrastructure supporting our global SaaS platform.
• Lead reliability and scalability initiatives that span multiple teams and services, driving automation and resilience through infrastructure-as-code and GitOps practices.
• Proactively identify and remediate systemic reliability issues, ensuring high service availability and performance across multi-cloud environments.
• Collaborate with software and platform teams to integrate reliability principles, SLOs, and observability standards into every stage of the development lifecycle.
• Act as a key technical leader during major incidents, coordinating response efforts, conducting root cause analysis, and implementing long-term corrective actions.
• Contribute to continuous improvement by defining infrastructure patterns, refining CI/CD workflows, and mentoring other engineers in automation and reliability best practices.

🎯 Requirements

• At least 7 years of hands-on experience as an SRE, DevOps, or Infrastructure Engineer in production cloud environments.
• Strong expertise with Kubernetes operations and ecosystem tooling in production-scale clusters.
• Proven experience designing and maintaining multi-cloud infrastructure across Azure, AWS, or GCP.
• Advanced proficiency with Terraform and Terragrunt, capable of designing modular, reusable, and secure IaC components.
• Solid understanding of GitOps principles and deployment automation using ArgoCD or similar tools.
• Deep experience with Linux systems administration, performance tuning, and troubleshooting.
• Proficiency in one or more programming/scripting languages (Python, Bash, Go preferred).
• Strong understanding of observability concepts and experience working with monitoring and alerting tools such as Prometheus, Grafana, and Thanos.
• Experience participating in or leading on-call rotations, handling incident response, and conducting post-incident reviews.

🏖️ Benefits

• 25 days paid annual leave (plus holidays)
• Health insurance
• Life and disability insurance
• Funds toward professional development resources
• Parental leave
• Massive degree of flexibility and freedom


