Senior Site Reliability Engineer (SRE)

October 21

SiFi

Fintech • SaaS • B2B

SiFi is a Saudi-based expense management platform that provides businesses with corporate cards (physical and virtual), real-time expense tracking, automated accounting, reimbursement workflows, and payments automation. Licensed by the Saudi Central Bank, SiFi offers a mobile app and platform tools for budgeting, spend controls, analytics, and integrations with business software, aiming to improve financial governance, efficiency, and security for organizations.

11 - 50 employees

Founded 2021

📋 Description

• Maintain and evolve multi-region cloud infrastructure using Terraform-based Infrastructure as Code (IaC)
• Operate and optimize Kubernetes (OKE) clusters running microservices, data pipelines, and workflow orchestration
• Manage SQL Server backup/restore pipelines, DR testing, and performance optimization
• Ensure high availability for .NET and Python applications hosted behind load balancers and WAF
• Design and maintain cross-network connectivity (DRGs, LPGs, VCNs, subnets, and NSGs)
• Build and maintain a centralized orchestration platform integrated with alerting and notification systems
• Develop self-healing, monitoring, and auto-remediation scripts for infrastructure and databases
• Implement logging, metrics, and tracing pipelines
• Automate recurring operational tasks using Python, Bash, and PowerShell to reduce manual effort and improve reliability
• Manage GitHub Actions and Octopus Deploy pipelines for backend and data services
• Apply strong security principles: least privilege, network segmentation, secure credentials, and encrypted communications
• Promote GitOps and Infrastructure as Code practices to ensure repeatable and traceable deployments
• Collaborate with developers to embed reliability and resilience into every release
• Lead incident response, run blameless post-mortems, and turn findings into lasting improvements
• Partner closely with engineering teams to drive design and code-level reliability improvements
• Conduct capacity planning, cost optimization, and system tuning for performance and scalability
• Mentor engineers in automation, observability, and root-cause analysis best practices

🎯 Requirements

• 5+ years of experience in Site Reliability, DevOps, or Infrastructure Engineering
• Solid understanding of networking, load balancing, and DNS
• Proven ability to analyze incidents and automate resolution
• Experience integrating alerting and monitoring systems with communication tools (e.g., Microsoft Teams or Slack)
• Oracle Cloud Infrastructure (OCI): compute, networking, storage, monitoring
• Kubernetes (OKE): deployments, ingress controllers, autoscaling
• Microsoft SQL Server: backup/restore automation, DR planning, performance tuning
• Terraform: multi-region and cross-tenant infrastructure automation
• Python & PowerShell: automation and system scripting

🏖️ Benefits

• Health insurance
• Flexible working arrangements
• Professional development
