Senior Site Reliability Engineer, SRE

September 14

OutSystems

Enterprise • Productivity • SaaS

OutSystems is a software company that provides a low-code application development platform. It allows organizations to develop, deploy, and manage enterprise-grade applications with minimal coding effort. By simplifying the process of application development, OutSystems helps businesses accelerate their digital transformation and improve productivity.

1001 - 5000 employees

Founded 2001

📋 Description

• Lead and onboard services and teams to the reliability tenets;
• Establish and maintain Service Level Objectives (SLOs) and Service Level Agreements (SLAs);
• Design and implement scalable, reliable, and secure infrastructure, while ensuring cloud-native best practices;
• Collaborate with software development teams to ensure systems are resilient (observable, fault-tolerant, recoverable, scalable) and performant;
• Implement monitoring, alerting, logging, and tracing solutions to detect and respond to incidents;
• Lead incident response efforts, ensuring quick resolution and minimal downtime, and conduct RCA/post-mortems;
• Automate every operational task, with a special focus on fast incident detection & recovery;
• Foster a culture of continuous improvement and knowledge sharing;
• Communicate effectively with stakeholders, providing updates on system reliability and performance;
• Participate in on-call rotation to provide 24/7 support for production systems.

🎯 Requirements

• STEM degree (BSc or MSc in Software Engineering, Computer Science, or a related field);
• 5+ years of experience in software development and/or operations;
• Proficiency in at least one high-level programming language (C++, Python, Java, C#, etc.);
• Strong troubleshooting and debugging skills;
• Fluency in English and excellent communication skills;
• Experience in any of the following is valued, but not strictly required:
  • Containerization technologies and orchestration platforms, mainly Kubernetes (CKA, CKAD, CKS certifications are valued);
  • Automation and Infrastructure as Code (IaC) tools, such as AWS CloudFormation, Terraform, Puppet, Chef, Spacelift, etc.;
  • Python, Go, Bash/Shell scripting, or other automation tools/languages;
  • Familiarity with AWS services like EC2, RDS, ELB, CloudFront, Lambda, etc.;
  • Proficiency in monitoring and troubleshooting complex distributed systems;
  • Grafana, ELK stack, Prometheus, or similar tools;
  • Strong understanding of designing resilient and fault-tolerant systems;
  • Expertise in debugging complex distributed systems.

🏖️ Benefits

• A company that is always growing, changing, and innovating.
• Real career opportunities.
• Colleagues who are as smart, hard-working, and driven as you.
• Disrupting the status quo is in our DNA.

Similar Jobs

September 5

Smart Working

51 - 200

🤝 B2B

☁️ SaaS

🎯 Recruiter

DevOps Engineer operating and hardening AWS infrastructure for Smart Working. Leading deployments, automation, observability, incident response, and vulnerability management.

🇮🇳 India – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

September 5

Motive

1001 - 5000

🚗 Transport

🤖 Artificial Intelligence

🏢 Enterprise

Site Reliability Engineer scaling and automating Motive's AWS infrastructure for fleet operations. Ensuring high availability, monitoring, and deployment pipelines for customer-facing systems.

🇮🇳 India – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

September 2

NVIDIA

10,000+ employees

🤖 Artificial Intelligence

🎮 Gaming

Senior SRE at NVIDIA DGX Cloud operating GPU-accelerated Kubernetes clusters across major clouds. Ensuring reliability, observability, and incident response for production AI infrastructure.

August 28

Saaf Finance

2 - 10

🤖 Artificial Intelligence

💸 Finance

💳 Fintech

DevOps Engineer at Saaf Finance builds AI-driven mortgage infrastructure. Designs and maintains AWS-based platforms and CI/CD pipelines.

🇮🇳 India – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

August 27

RemoteStar

11 - 50

🤝 B2B

🎯 Recruiter

☁️ SaaS

DevOps Engineer supporting a company building scalable 3D AEC applications. Manage Azure infrastructure, CI/CD, containers, monitoring, and deployment automation.

🇮🇳 India – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)
