GitLab

Artificial Intelligence • Enterprise • SaaS

GitLab is the most comprehensive AI-powered DevSecOps platform, offering tools for automated software delivery, security, and compliance throughout the software development lifecycle. It provides solutions across areas such as AI-assisted development, continuous integration/continuous deployment (CI/CD), source code management, and vulnerability management. GitLab aims to simplify and accelerate software delivery by uniting development, security, and operations on a unified platform. It is particularly recognized for its AI code assistants and has been named a leader in the Gartner Magic Quadrant™ for DevOps Platforms, making it a preferred choice for many enterprises.

1001 - 5000 employees

Founded 2014

🤖 Artificial Intelligence

🏢 Enterprise

☁️ SaaS

💰 Secondary Market on 2020-11

Intermediate Site Reliability Engineer, Database Operations

October 9

🇪🇺 Europe – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

Ansible

Chef

Distributed Systems

Kubernetes

Postgres

Puppet

Ruby

SQL

Terraform

📋 Description

• Automating every operational task is a core requirement for this role: for example, package updates, configuration changes across all environments, and building tools for automatic provisioning of user-facing services.
• Respond to platform emergencies, alerts, and escalations from Customer Support.
• Ensure systems exist to manage software life-cycles (e.g. operating systems) with a minimum of manual effort.
• Develop a fully automated multi-environment observability stack based on the existing SaaS system, and extend it to predict capacity needs based on usage patterns.
• Plan for new service roll-outs, expansion and capacity management of existing services, and work with users to optimize their resource consumption.
• Work on database reliability and performance for GitLab.com from within the SRE team, and work on shipping solutions with the product.
• Analyze solutions and implement best practices for our PostgreSQL database clusters and their components.
• Work on observability of relevant database metrics and make sure we reach our database objectives.
• Work with peer SREs to roll out changes to our production environment and help mitigate database-related production incidents.
• Provide on-call support in rotation with the team.
• Provide database expertise to engineering teams (for example through reviews of database migrations, queries and performance optimizations).
• Work on automation of database infrastructure and help engineering succeed by providing self-service tools.
• Use the GitLab product to run GitLab.com as a first resort and improve the product as much as possible.
• Plan the growth of GitLab's database infrastructure.
• Design, build and maintain core database infrastructure components that allow GitLab to scale to support hundreds of thousands of concurrent users.
• Support and debug database production issues across services and levels of the stack.
• Make monitoring and alerting fire on symptoms and not on outages.
• Document every action so your learnings turn into repeatable actions and then into automation.

🎯 Requirements

• Have primary experience running PostgreSQL in high-growth, large production environments using both self-managed deployments (VM, Kubernetes with modern PostgreSQL Operators) as well as DBaaS services.
• Have hands-on experience using data from PostgreSQL internals to design, build and troubleshoot systems.
• Have primary experience with infrastructure automation, orchestration and configuration management (Chef, Ansible, Puppet, Terraform).
• Have a solid understanding of SQL and PL/pgSQL.
• Have significant experience working in a large SaaS distributed-systems production environment.
• Share our values, and work in accordance with those values.
• Have excellent written and verbal English communication skills, with an urge to collaborate and communicate asynchronously.
• Have an urge to document all the things so you don't need to learn the same thing twice, and an urge for delivering quickly and iterating fast.
• Have a proactive, go-for-it attitude. When you see something broken, you can't help but fix it.
• Have solid data modeling and data structure design skills.
• Bonus: Solid programming skills as a (former) backend engineer, preferably with Ruby and/or Go.
• Bonus: Experience with ClickHouse or another modern OLAP database.