Senior Site Reliability Engineer

November 20

Omilia - Conversational Intelligence

Artificial Intelligence • eCommerce • Customer Service

Omilia is a leader in Conversational AI, specializing in voice and chat solutions that enable natural, end-to-end customer interactions. Their Omilia Cloud Platform provides advanced AI-driven customer service tools, including real-time agent assistance, voice biometrics for fraud prevention, and data analytics to enhance customer insights. Serving industries such as finance, insurance, retail, automotive, and travel, Omilia focuses on automating customer service while ensuring a secure and personalized experience.

201 - 500 employees

Founded 2002

📋 Description

• Ensure platform reliability and availability across production and pre-production environments through proactive monitoring, alerting, and automation.
• Act as first responder for incidents and contribute to problem management and root cause analysis.
• Support the development team's reliability efforts and help build a solid reliability culture within the development lifecycle.
• Develop troubleshooting documentation for production support resources.
• Collaborate with Engineering teams to develop optimised, productive runbooks and operational documentation, and to automate operational tasks.
• Collaborate with development and cloud engineering teams to embed reliability and performance into the software delivery lifecycle.
• Design, implement, and evolve observability solutions (metrics, logs, traces, dashboards) using tools such as Prometheus, Grafana, and ELK.
• Participate in on-call rotations and continuously improve alert quality and response processes.
• Champion a culture of reliability, performance, and continuous improvement across teams.

🎯 Requirements

• Bachelor's or Master's degree in Engineering, or equivalent.
• Experience operating at least one container orchestration cluster (Kubernetes, Docker Swarm).
• Experience developing or maintaining software for production services at scale.
• Experience with ELK.
• Experience with AWS.
• Experience with the Grafana/Prometheus stack.
• Strong scripting skills (Bash, Python, or Go).
• Excellent communication skills.
• Thinking outside the box and anticipating challenges. It is imperative that we are not simply reactive; we must expect challenges and question the technologies, procedures, and thinking already in place. You will be expected to constantly review and challenge at all levels.
• Versatility. We work with agile/lean methods. We'd much rather iterate and learn than assume we know all the answers.
• Being a team player. You don't (always) work in isolation, and you are excited by the thought of drawing on your team while involving product, experience design, engineering, and more in the process.

**Considered a plus:**

• Telephony knowledge (SIP, VoIP).
• Experience in Linux administration (RedHat, CentOS, AL).
• Working knowledge of configuration management tools (Terraform, Ansible).
• Experience with TCP/IP and general networking concepts.
• RDBMS knowledge (MySQL, Postgres).
• NoSQL knowledge (Redis).

🏖️ Benefits

• Fixed compensation.
• Long-term employment with the working days vacation.
• Professional development (courses, training, etc.).
• Being part of successful cutting-edge technology products that are making a global impact in the service industry.
• Proficient and fun-to-work-with colleagues.
• Apple gear.

