Senior Site Reliability Engineer, SRE

🔥 0 minutes ago

Sleek

51 - 200 employees

Founded 2017

🏢 Enterprise

💸 Finance

☁️ SaaS

Sleek is a digital platform that provides comprehensive business services aimed at small and medium enterprises (SMEs). The company specializes in company incorporation, accounting, payroll, corporate secretary services, and business account setup. Sleek enables businesses to streamline their administrative tasks and compliance processes through a user-friendly, digital platform. Operating globally, Sleek is committed to offering fast and efficient services with transparent pricing, making the process of setting up and managing a business hassle-free and entirely digital.

📋 Description

• At Sleek, we are on a mission to streamline operations and elevate customer experience through intelligent automation and modern engineering. We are seeking a Senior SRE Engineer who will be a key individual contributor responsible for architecting, building, and scaling Sleek’s next-generation infrastructure and AI-powered capabilities.
• Partner closely with Product, Engineering, and AI teams to define our infrastructure strategy, design resilient cloud architectures, and ensure our platforms remain secure, scalable, and high-performing.
• Play a central role in integrating AI systems into production environments, enabling efficient delivery, observability, and reliability across Sleek’s products and internal operations.
• Ensure high-quality, secure, and scalable infrastructure capable of supporting modern applications and advanced AI workloads.
• Implement robust automation across CI/CD, infrastructure provisioning, and operations to increase reliability and reduce manual overhead.
• Integrate AI into operational workflows to improve efficiency, detect anomalies, and accelerate delivery.
• Adhere to strong DevOps standards including reproducibility, testing, documentation, and operational excellence.
• Maintain clear technical communication and cross-team alignment to enable predictable delivery and collaborative problem-solving.
• Contribute mentorship and technical leadership that elevates platform engineering, DevOps maturity, and overall engineering quality across the organization.

🎯 Requirements

• 6+ years of progressive experience in Site Reliability Engineering (SRE).
• 6+ years of strong, hands-on experience across multi-cloud environments such as AWS, GCP, and Azure, including expertise in networking, compute, storage, security, and cost optimization.
• 6+ years of deep expertise in containerization and orchestration (e.g., Kubernetes, EKS, ECS).
• 6+ years of extensive experience with Infrastructure as Code (IaC) (e.g., Terraform, Pulumi, CloudFormation).
• System Reliability: Proven ability to design, build, and operate highly reliable, scalable production systems utilizing advanced Zero-Downtime Deployment Patterns (e.g., Blue/Green, Canary, progressive delivery).
• Modern Delivery & Tooling: Expertise in modernizing deployments via GitOps practices (e.g., ArgoCD, Flux) and building Self-Service Developer Platforms that enable engineering efficiency (e.g., environment automation, internal tooling).
• Networking & Edge Routing: Experience implementing and managing Multi-Cloud API Gateways and Edge Routing solutions (e.g., Kong, Traefik, Cloudflare, multi-cluster ingress).
• Security & Hardening: Strong background in platform security, including secrets management, Identity and Access Management (IAM), and Runtime/Security Hardening with tools like Falco/eBPF and WAFs.
• Observability: Solid understanding and practical experience with modern observability stacks (e.g., Prometheus, OpenTelemetry, OpenSearch, ELK, CloudWatch).
• AI/ML Infrastructure: Experience supporting or deploying AI/ML workloads (e.g., model inference, vector databases, GPU workloads), or strong familiarity with the infrastructure requirements for these systems.
• Communication: Excellent communication and collaboration skills with a proven ability to describe complex infrastructure decisions clearly and a background in driving improvements in engineering practices.
• Development Expertise: Familiarity with modern languages and frameworks such as Node.js, NestJS, and Python is highly desirable for extending DevOps capabilities or integrating tooling.

🏖️ Benefits

• Humility and kindness: Humility is a core attribute we hire for, which means we have a culture of not taking ourselves too seriously and being able to laugh. Kindness is also incredibly important. We are committed to creating and nurturing a diverse and inclusive environment.
• Flexibility: The role will be fully remote, with work from home five days a week. If you need to start early or start late to cater to your family or other needs, we don’t mind, so long as you get your work done and proactively communicate. You can also work fully remote from anywhere in the world for 1 month each year.
• Financial benefits: We pay competitive market salaries and provide staff with generous paid time off and holiday schedules. Certain staff at Sleek are also eligible for our employee share ownership plan and can share in the upside of our stellar growth trajectory as we work toward listing on a prominent stock exchange in the Asia Pacific region.
• Personal growth: You’ll get a lot of responsibility and autonomy at Sleek. We move at a fast pace, so you’ll be making decisions, making mistakes, and learning. We also run a range of internal and external training programmes. We’re at the forefront of utilising AI in our space and are developing a regional centre of AI excellence. It is our intention that if you leave Sleek, you leave as a more well-rounded person and professional.
• Sleek is also a proudly certified B Corp. Since we started our journey in 2017, we’ve been committed to building Sleek as a force for good. In just over 5 years, we’ve joined a community of industry leaders like Patagonia, Ben & Jerry's, and P&G who are building an inclusive, equitable, and regenerative economy.
