Senior Site Reliability Engineer

Job not on LinkedIn

October 29


Fellow.app

SaaS • Productivity • Artificial Intelligence

Fellow.app is an advanced AI meeting assistant designed to streamline every aspect of the meeting process. The platform provides features like AI-enabled meeting transcription, action item organization, meeting agendas, and meeting minutes, all integrated with over 50 productivity tools such as Google, Slack, Asana, and Zoom. With an emphasis on enhancing productivity and collaboration, Fellow.app allows for seamless interaction before, during, and after meetings, catering to a wide range of organizational needs including team meetings, one-on-ones, and cross-functional sessions. The AI-driven tools aim to optimize meeting outcomes by helping teams stay organized, accountable, and aligned across various projects and objectives.

51 - 200 employees

💰 $24M Series A in October 2021

📋 Description

• Design, implement, and manage reliable, scalable systems to support Fellow’s AI Meeting Assistant and other platform features.
• Optimize and maintain our AWS infrastructure, including EC2, RDS, and other cloud services.
• Oversee and optimize Kubernetes clusters to ensure high availability and performance.
• Enhance and maintain CI/CD pipelines to support efficient, high-quality deployments.
• Set up and improve monitoring, logging, and alerting systems to detect and resolve issues proactively.
• Work closely with the engineering, product, and QA teams to support feature development and deployment.
• Use tools like Pulumi to automate infrastructure provisioning and management.
• Lead root cause analysis and implement changes to prevent future incidents.
• Experiment with and adopt new technologies to enhance system performance and scalability.

🎯 Requirements

• 2+ years of experience in site reliability engineering or a related field, with a strong understanding of cloud-based infrastructure.
• Proficiency with Kubernetes, AWS, and databases.
• Experience with monitoring and observability tools such as Prometheus, Grafana, or Datadog.
• Familiarity with CI/CD tools like GitHub Actions, Jenkins, or GitLab CI.
• Strong problem-solving skills and a proactive approach to reliability challenges.
• Excellent communication skills and the ability to collaborate effectively in a team environment.
• Bonus: Experience with Pulumi, Elasticsearch, or MLOps tools is highly valued.

🏖️ Benefits

• Team Culture: Join a collaborative, innovative team that values continuous learning and growth.
• Impact: Work on meaningful projects that shape the future of work and make meetings more productive.
• Flexibility: We’re a remote-first organization with offices and co-working spaces available in Ottawa (our HQ), Montreal, and Toronto for those who prefer in-person collaboration.
• Growth: Be part of a growing, Series A-funded startup backed by leading venture capital firms such as Craft, iNovia, and Felicis.
