Senior SRE II

🔥 15 hours ago

Filevine

201 - 500 employees

☁️ SaaS

🤖 Artificial Intelligence

💰 $108M Series D in April 2022

SaaS • Legal • Artificial Intelligence

Filevine is a comprehensive legal technology platform that offers a wide range of services tailored to law firms and legal practitioners. The platform provides solutions for case management, document management, and contract management, as well as lead management and business analytics. Utilizing advanced AI technologies, Filevine enhances the legal workflow with tools such as DemandsAI, ImmigrationAI, and FilevineAI to automate tasks and improve productivity. It integrates seamlessly with popular tools like QuickBooks and Gmail, ensuring a complete legal tech stack. Filevine also offers eSignature capabilities, time and billing features, and a client portal to facilitate communication. The platform is utilized by various types of law practices, including personal injury, family law, mass torts, and more, and is recognized for its robust security standards and compliance certifications such as SOC 2 Type II and HIPAA.

📋 Description

• Own and evolve observability strategy, including monitoring, alerting, dashboards, logging, and distributed tracing.
• Define and manage SLIs, SLOs, and reliability metrics.
• Lead incident response, postmortems, and continuous improvement initiatives.
• Improve MTTD and MTTR through automation and operational excellence.
• Integrate observability into CI/CD pipelines and software delivery workflows.
• Build and maintain reliable cloud infrastructure on AWS and Kubernetes.
• Mentor engineers and promote SRE best practices across the organization.

🎯 Requirements

• 8+ years of experience in software engineering, infrastructure, or operations.
• 5+ years of Site Reliability Engineering experience.
• Deep expertise with observability platforms such as New Relic, Datadog, Dynatrace, Grafana, or Prometheus.
• Strong experience with monitoring, alerting, incident management, and reliability engineering practices.
• Hands-on experience with AWS, Kubernetes, and cloud-native technologies.
• Proficiency in Python, Bash, PowerShell, or similar scripting languages.

🏖️ Benefits

• Medical, Dental, & Vision Insurance (for full-time employees)
• Competitive & Fair Pay
• Maternity & paternity leave (for full-time employees)
• Short & long-term disability
• Opportunity to learn from a dedicated leadership team
• Top-of-the-line company swag

Similar Jobs

🔥 15 hours ago

Simple Technology Solutions

51 - 200

🏛️ Government

🤖 Artificial Intelligence

DevSecOps/Cloud Engineer responsible for deployment infrastructure and security controls for federal cloud platform. Collaborating on CI/CD pipelines using AWS services and ensuring regulatory compliance.

AWS

Cloud

EC2

SDLC

🔥 18 hours ago

Veeam Software

1001 - 5000

☁️ SaaS

🔒 Cybersecurity

🏢 Enterprise

Site Reliability Engineer supporting Veeam Data Cloud, working on reliability practices and cloud infrastructure for government compliance. Collaborating with senior engineers and maintaining operational foundation.

Azure

Cloud

Distributed Systems

Grafana

Java

JavaScript

Kubernetes

Prometheus

Terraform

TypeScript

Go

🔥 21 hours ago

Armada

51 - 200

📡 Telecommunications

🤖 Artificial Intelligence

🏢 Enterprise

Deployment Engineer managing modular data center deployments for Armada. Executing installation, troubleshooting, and operational readiness activities in North America.

🔥 21 hours ago

The Leaflet

11 - 50

🔌 API

Senior Site Reliability Engineer optimizing Java applications while pioneering AI-driven operations for high-traffic environments. Collaborating with teams to enhance reliability and performance across distributed systems.

Ansible

AWS

Azure

Cloud

Google Cloud Platform

Grafana

Java

Kubernetes

Prometheus

Python

Terraform

Go

🔥 21 hours ago

HavocAI

11 - 50

🤖 Artificial Intelligence

🔐 Security

🔧 Hardware

Senior Site Reliability Engineer at HavocAI responsible for reliability architecture and incident management. Ensuring performance, resilience, and operational maturity of mission-critical cloud services.

Cloud

Distributed Systems

Kubernetes

Linux

Python

Go