Senior Site Reliability Engineer – Kubernetes

11 - 50 employees

Founded 2022

🎯 Recruiter

👥 HR Tech

🤝 B2B

Recruitment • HR Tech • B2B

Pragmatike is a remote IT recruitment and staffing company that sources, vets, and places international tech talent for businesses. It provides human-curated matching (not solely AI), technical assessments, and fast placements, claiming that qualified specialists can be introduced within 48 hours, and it handles onboarding, invoicing, and international payroll. Pragmatike serves startups and enterprises, filling roles such as developers, data engineers, mobile and game developers, product managers, and other specialists, and operates across 60+ countries with a large vetted talent pool.

🇵🇹 Portugal – Remote

⏰ Full Time

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

Ansible

Cloud

Distributed Systems

Grafana

Kubernetes

Linux

Node.js

OpenStack

Prometheus

Python

VMware

📋 Description

• Operate and maintain Linux-based infrastructure (Debian/Ubuntu).
• Deploy, manage, and scale Kubernetes clusters across bare-metal, virtualized, and on-prem environments.
• Oversee full cluster lifecycle: upgrades, node pools, networking, storage, and security hardening.
• Implement automation for provisioning and operations using Ansible, Bash/Python, and GitOps workflows.
• Design and maintain networking architecture including VLANs, L2/L3 routing, VPNs, and multi-site connectivity.
• Build automated deployment workflows (PXE boot, Preseed, cloud-init).
• Deploy and maintain observability stacks (Prometheus/Grafana, Loki, ELK, Graylog).
• Lead incident response and escalation activities across the platform.
• Improve system availability and reduce latency at all levels.
• Define and implement SLOs/SLIs at multiple infrastructure levels (physical network/hardware, platform virtualization, software services).
• Optimize alerting and monitoring pipelines to provide actionable insights.
• Establish and maintain on-call schedules to ensure coverage across timezones.
• Develop Standard Operating Procedures (SOPs) for repeatable operations and maintenance tasks.
• Coordinate physical maintenance for Policlouds (periodic maintenance, hardware issues, DC-Ops).
• Manage virtualization and orchestration layers (OpenStack, Proxmox, VMware).
• Help develop and maintain overall architecture across all products.
• Plan resources for future initiatives, accounting for demand and growth projections.
• Work with development teams to improve overall quality and optimize resource utilization.
• Collaborate with cross-functional stakeholders (Hivenet, Policloud, Customer Success teams).

🎯 Requirements

• Expert-level, hands-on experience operating Kubernetes in production environments.
• Strong network engineering skills (VLANs, L2/L3 routing, VPNs, multi-site connectivity) - this is essential for the role.
• Strong proficiency with Linux systems administration (Debian/Ubuntu).
• Solid understanding of networking fundamentals and ability to design complex network architectures.
• Experience building and maintaining automation workflows (Ansible, Bash/Python, Git-based).
• Experience with observability stacks such as Prometheus, Grafana, ELK, Loki, or Graylog.
• Background with virtualization technologies (OpenStack, Proxmox, VMware).
• Experience with bare-metal provisioning and MAAS (Metal as a Service).
• Strong understanding of distributed systems and container orchestration.
• Process-oriented mindset with ability to develop SOPs and operational procedures from scratch.
• Experience with incident response, escalation procedures, and on-call rotations.
• Ability to work autonomously in a fast-paced, engineering-driven environment.
• Strong technical skills combined with alignment to team values.

🏖️ Benefits

• 100% remote work with flexible hours
• High-impact role with autonomy and ownership
• Collaborative and international engineering team
• Cutting-edge tech stack with strong focus on reliability and automation
