Site Reliability Engineer – APAC

pod network

1 - 10 employees

🌐 Web 3

Blockchain • Web 3 • Software

pod network is a pioneering platform aiming to simplify and accelerate Web3 development. By introducing a unique stake-based programmable layer one, pod focuses on delivering optimal latency and seamless user experience for decentralized applications. The company's ethos centers on providing an infrastructure that enables developers to build real products with a sleek user interface, without the complexities of traditional blockchain technology. With backing from notable investors and a commitment to community engagement, pod network seeks to onboard the next billion users to a more democratic internet.

📋 Description

• Monitor the health and performance of the platform
• Respond to production incidents and drive them through to resolution
• Investigate failures, identify root causes, and coordinate fixes
• Ensure issues are detected, understood, and addressed quickly
• Identify recurring operational pain points and eliminate them
• Improve software, deployment processes, and operational workflows
• Participate in incident reviews and help drive preventative improvements
• Contribute reliability-focused changes directly to production systems
• Design and maintain dashboards, metrics, alerting, and monitoring systems
• Improve signal quality while reducing alert fatigue
• Build automation and internal tools that make the platform easier to operate
• Help establish reliability best practices across the engineering organization

🎯 Requirements

• Strong experience with Linux and cloud infrastructure
• Experience operating and supporting production systems
• Experience with Docker and containerized environments
• Experience with observability and incident-management tools such as Grafana, Prometheus, PagerDuty, or similar
• Ability to automate workflows using Rust, Python, Bash, or similar languages
• Strong troubleshooting and debugging skills
• A high degree of ownership and the ability to make sound decisions independently
• Nice to have: experience with distributed systems, high-availability and low-latency services, CI/CD systems, deployment automation, and designing secure operational workflows and access controls

🏖️ Benefits

• Competitive compensation (~$100k USD/year)
• Meaningful token/equity allocation
• Real ownership and responsibility from day one
• Work from wherever you are within the target timezone range (UTC+7 to UTC+1)
• Occasional travel to Europe and elsewhere for team meetups
