Senior Site Reliability Engineer

🕒 April 1

CertifID

11 - 50 employees

Founded 2017

💳 Fintech

🏠 Real Estate

🔒 Cybersecurity

💰 $12.5M Series A on 2022-05

CertifID is a technology company that specializes in protecting against wire fraud, particularly in real estate transactions. Its platform verifies identities and validates bank details, protecting real estate agents, law firms, title agents, and home buyers and sellers from fraud. With services like fraud recovery and wire fraud insurance, CertifID ensures transactions are secure. The company partners with title companies and law firms to raise the standard of care for clients, providing robust insurance coverage to protect client funds. CertifID is highly rated for wire fraud protection, with user-friendly software that quickly and effectively prevents fraudulent transactions.

📋 Description

• Own and improve the reliability, availability, and performance of production systems while defining and operationalizing SLIs/SLOs and error budgets.
• Design and implement autonomous and semi-autonomous AI agents for monitoring distributed systems and applications. Build agents capable of consuming multi-source observability data (metrics, logs, traces, etc.).
• Participate in and help lead an on-call rotation, serving as an escalation point for major incidents and facilitating blameless postmortems.
• Build automated workflows to eliminate manual work and design/maintain Infrastructure-as-Code with Terraform.
• Improve metrics, logs, traces, and alerting using tools like Datadog or Prometheus to reduce noise and increase signal.
• Partner with application teams to implement reliability best practices and mentor junior engineers to foster a culture of knowledge sharing.

🎯 Requirements

• 5+ years in SRE, DevOps, Platform Engineering, or Infrastructure Engineering.
• Proven experience supporting production SaaS systems in Azure (preferred), AWS, or GCP.
• Strong Linux, networking, and distributed systems troubleshooting skills.
• Strong experience with containers and orchestration (Kubernetes/EKS/AKS).
• Expertise with Infrastructure-as-Code (Terraform strongly preferred).
• Strong scripting/programming skills in Python, Go, Bash, or C#/.NET.
• Hands-on experience with Datadog, Prometheus/Grafana, or OpenTelemetry.

🏖️ Benefits

• Flexible vacation
• 12 company-paid holidays
• 10 paid sick days
• No work on your birthday
• Health, dental, and vision insurance (including a $0 option)
• 401(k) with matching and no waiting period
• Equity
• Life insurance
• Generous paid parental leave
• Wellness reimbursement of $300/year
• Remote worker reimbursement of $300/year
• Professional development reimbursement
• Competitive pay
• An award-winning culture

Similar Jobs

🕒 March 31

Liatrio

51 - 200

🏢 Enterprise

☁️ SaaS

Senior DevOps Engineer consulting for Liatrio, a boutique firm in engineering delivery and people enablement. Driving DevOps transformations across diverse industries.

Ansible

Azure

Chef

Cloud

ElasticSearch

Grafana

Java

JavaScript

Jenkins

Kubernetes

Prometheus

Puppet

Python

Terraform

Go

.NET

🕒 March 31

Mactores

51 - 200

🏢 Enterprise

DevOps Engineer specializing in infrastructure automation at Mactores, enhancing cloud deployment practices for data solutions. Responsible for managing deployment cycles and operational best practices in cloud environments.

Ansible

AWS

Chef

Cloud

Consul

Grafana

Graphite

Jenkins

Kubernetes

Linux

Terraform

🕒 March 31

Shop Your Way

10,000+ employees

👥 B2C

💸 Finance

🛒 Retail

DevOps Engineer managing automation and deployment processes for software development at Shop Your Way. Collaborating to maintain uptime for the ShopYourWay platform and applications.

AWS

Cloud

DNS

Firewalls

Grafana

Java

Jenkins

Linux

MySQL

NoSQL

PHP

Postgres

Python

SQL

VMware

.NET

🕒 March 31

Filevine

201 - 500

☁️ SaaS

🤖 Artificial Intelligence

Site Reliability Engineer at Filevine ensuring reliable and scalable legal AI systems. Collaborating with teams to enhance performance, security, and deployment processes across operations.

AWS

Cloud

EC2

Google Cloud Platform

Kubernetes

Python

SDLC

🕒 March 31

Espresso Systems

11 - 50

₿ Crypto

🌐 Web 3

DevOps Engineer assisting the development team in building infrastructure for the Espresso Network. Supporting production of sequencer software and deployment tooling for test networks.

AWS

Azure

Cloud

Google Cloud Platform

Linux

Terraform