Intermediate Site Reliability Engineer, Database Operations

October 9


GitLab

Artificial Intelligence • Enterprise • SaaS

GitLab is the most comprehensive AI-powered DevSecOps platform, offering tools for automated software delivery, security, and compliance throughout the software development lifecycle. It provides solutions across areas such as AI-assisted development, continuous integration/continuous deployment (CI/CD), source code management, and vulnerability management. GitLab aims to simplify and accelerate software delivery by bringing development, security, and operations together on a single platform. It is particularly recognized for its AI code assistants and has been named a leader in the Gartner Magic Quadrant™ for DevOps Platforms, making it a preferred choice for many enterprises.

1001 - 5000 employees

Founded 2014

🤖 Artificial Intelligence

🏢 Enterprise

☁️ SaaS

💰 Secondary Market on 2020-11

📋 Description

• Automating every operational task is a core requirement for this role: for example, package updates, configuration changes across all environments, and creating tools for automatic provisioning of user-facing services.
• Respond to platform emergencies, alerts, and escalations from Customer Support.
• Ensure systems exist to manage software life cycles (e.g. operating systems) with a minimum of manual effort.
• Develop a fully automated multi-environment observability stack based on the existing SaaS system, and extend it to predict capacity needs based on usage patterns.
• Plan for new service roll-outs, expansion, and capacity management of existing services, and work with users to optimize their resource consumption.
• Work on database reliability and performance for GitLab.com from within the SRE team, and work on shipping solutions with the product.
• Analyze solutions and implement best practices for our PostgreSQL database clusters and their components.
• Work on observability of relevant database metrics and make sure we reach our database objectives.
• Work with peer SREs to roll out changes to our production environment and help mitigate database-related production incidents.
• Provide on-call support in rotation with the team.
• Provide database expertise to engineering teams (for example, through reviews of database migrations, queries, and performance optimizations).
• Work on automation of database infrastructure and help engineering succeed by providing self-service tools.
• Use the GitLab product to run GitLab.com as a first resort, and improve the product as much as possible.
• Plan the growth of GitLab's database infrastructure.
• Design, build, and maintain core database infrastructure components that allow GitLab to scale to support hundreds of thousands of concurrent users.
• Support and debug database production issues across services and levels of the stack.
• Make monitoring and alerting alert on symptoms and not on outages.
• Document every action so your learnings turn into repeatable actions and then into automation.

🎯 Requirements

• Have primary experience running PostgreSQL in high-growth, large production environments using both self-managed deployments (VM, Kubernetes with modern PostgreSQL Operators) and DBaaS services.
• Have hands-on experience using data from PostgreSQL internals to design, build, and troubleshoot systems.
• Have primary experience with infrastructure automation, orchestration, and configuration management (Chef, Ansible, Puppet, Terraform).
• Have a solid understanding of SQL and PL/pgSQL.
• Have significant experience working in a large SaaS distributed-systems production environment.
• Share our values, and work in accordance with those values.
• Have excellent written and verbal English communication skills, with an urge to collaborate and communicate asynchronously.
• Have an urge to document all the things so you don't need to learn the same thing twice, and an urge for delivering quickly and iterating fast.
• Have a proactive, go-for-it attitude. When you see something broken, you can't help but fix it.
• Have solid data modeling and data structure design skills.
• Bonus: Solid programming skills as a (former) backend engineer, preferably with Ruby and/or Go.
• Bonus: Experience with ClickHouse or another modern OLAP database.

đŸ–ïž Benefits

‱ GitLab is proud to be an equal opportunity workplace ‱ GitLab’s policies and practices are based solely on merit


