Senior Live-Ops Site Reliability Engineer

November 13


eDynamic Learning

eDynamic Learning, a company founded by a classroom teacher, is dedicated to supporting educators with equitable and accessible instructional materials, including full-course curriculum spanning grades 6-12 as well as supplemental resources, as they prepare learners for life after graduation. We support teachers and programs that help students explore their interests and career options, acquire skills through career and technical education (CTE), and develop life readiness skills such as interpersonal communication, financial literacy, and more. We take pride in the fact that our solutions and services are designed to support educators as they guide students on a transformative journey of exploration, engage them in learning, and enable them to partake in real-world experiences.

51 - 200 employees

Founded 2008

📋 Description

• Own the availability, reliability, and performance of production systems and services
• Design and maintain scalable infrastructure to support high-traffic educational applications
• Build monitoring, alerting, and observability systems to proactively detect and resolve issues
• Lead incident response and postmortem processes to improve resilience and reduce downtime
• Develop automation tools and scripts to streamline deployments, operations, and recovery
• Collaborate closely with engineering and DevOps teams to design and implement fault-tolerant systems
• Continuously refine CI/CD pipelines and deployment processes for speed and safety
• Champion best practices in infrastructure-as-code (IaC), security, and configuration management
• Partner with development teams to ensure reliable service releases and smooth rollouts
• Analyze capacity trends and system performance to plan for future growth
• Mentor junior engineers and contribute to an operational culture of transparency, ownership, and continuous learning

🎯 Requirements

• Bachelor’s Degree in Computer Science or equivalent experience
• 8+ years of experience in systems engineering, DevOps, or Site Reliability Engineering roles
• Proven experience managing mission-critical, high-availability production environments
• Strong background in Linux systems administration and performance tuning
• Expertise with AWS infrastructure and related services
• Proficiency with Docker, Kubernetes, and infrastructure-as-code tools such as Terraform or CloudFormation
• Solid programming/scripting skills in Python, Bash, or similar
• Experience with CI/CD pipelines, deployment automation, and Git-based workflows
• Deep understanding of networking, HTTP, and distributed systems principles
• Familiarity with monitoring and observability tools (Datadog, Prometheus, Grafana, etc.)
• Legally eligible to work in Canada and/or the U.S.

🏖️ Benefits

• Professional development
• Remote work options


