Senior Site Reliability Engineer, Kong Konnect

🕒 November 6, 2025

Kong Inc.

201 - 500 employees

Founded 2017

🔌 API

☁️ SaaS

🏢 Enterprise

💰 $100M Series D on 2021-02

Kong Inc. provides an API platform for API management, AI integration, and developer productivity. Its products include Kong Gateway and Kong Konnect, along with other tools for managing and optimizing the API lifecycle. The platform supports multi-cloud environments and is built for high performance and security. Gartner has recognized Kong as a leader in API management, and the platform is used across industries such as financial services, healthcare, and technology. The company emphasizes flexibility, security, and speed for enterprises looking to enhance their digital services through APIs. Kong also supports a community of developers and provides integrations and plugins to streamline API management and operations.

📋 Description

• Operate and scale Kong’s global SaaS platform (Konnect), ensuring reliability, availability, and performance across regions and clouds.
• Build, automate, and maintain Kubernetes-based infrastructure and deployment workflows using Terraform/Terragrunt, Helm, and ArgoCD.
• Design, maintain, and optimize multi-region data and caching layers (including PostgreSQL, Redis, ClickHouse, and Druid) for high availability and low latency.
• Operate and improve Kong Gateway and Kong Mesh environments supporting hybrid and distributed architectures.
• Develop and maintain CI/CD pipelines and GitOps workflows to automate service delivery and ensure consistent infrastructure changes.
• Enhance observability and incident response readiness through systems like Datadog, Prometheus, Grafana, and Thanos, defining and tracking SLOs.
• Collaborate closely with development and security teams to ensure smooth operation of SaaS services in compliance with reliability, security, and regulatory standards.
• Participate in a global 24/7 on-call rotation and drive continuous improvement of operational playbooks and postmortem practices.
• Lead and contribute to scaling initiatives that improve elasticity, reliability, and cost-efficiency across the SaaS platform.

🎯 Requirements

• BS in Computer Science or equivalent practical experience.
• Demonstrated experience running and scaling SaaS platforms in production, ideally across multiple cloud providers.
• Deep expertise in Kubernetes, including debugging cluster/networking issues and designing for fault tolerance and scalability.
• Strong proficiency with Infrastructure as Code tools like Terraform or Terragrunt.
• Experience with CI/CD pipelines and GitOps workflows (ArgoCD, Atlantis, Helm).
• Proficiency in one or more programming languages (Go, Python, Bash) for automation and tooling.
• Solid understanding of Linux/Unix systems, networking (DNS, TLS/SSL, HTTP), and distributed systems.
• Familiarity with streaming systems like Kafka and observability platforms (Datadog, Prometheus, Grafana).
• Experience working in a 24/7/365 production support environment.

🏖️ Benefits

• Health insurance
• Professional development opportunities
