Senior Software Engineer, Alerting and Observability

October 7

Cribl

SaaS • Cloud

Cribl provides a cloud-based service that lets users manage and analyze their data through a web application. The service includes user accounts and Google-based authentication.

501 - 1000 employees

Founded 2017

☁️ SaaS

📋 Description

• Design and build sophisticated alerting systems that enable proactive monitoring and incident detection across distributed systems
• Develop query-based alert rules and expressions using PromQL, SQL, and other query languages to surface meaningful insights
• Create intelligent alert routing, deduplication, and correlation mechanisms to reduce noise and improve signal quality
• Build scalable backend services for alert evaluation, notification delivery, and alert management workflows
• Optimize time-series data storage and query performance for high-volume metrics and telemetry data
• Develop intuitive interfaces for alert configuration, visualization, and management using React and modern frontend technologies
• Collaborate with cross-functional teams to understand monitoring requirements and deliver comprehensive alerting solutions
• Mentor and guide engineers on best practices for observability and alerting architecture

🎯 Requirements

• Strong proficiency in TypeScript/Node.js with a proven track record of building production-grade services
• Experience with query languages for metrics and monitoring (PromQL, SQL, or similar) and the ability to write complex queries for data analysis
• Hands-on experience building or maintaining alerting systems, including rule evaluation engines and notification pipelines
• Experience with time-series databases and columnar storage systems (ClickHouse experience is a plus)
• Frontend development skills with React and modern JavaScript frameworks for building data visualization and management interfaces
• Strong understanding of distributed systems, data structures, and algorithms
• Experience with observability concepts, including metrics, logs, traces, and their correlation
• Ability to work independently with minimal supervision and a track record of learning quickly
• Dedication to writing clean, maintainable, and well-tested code
• Experience with the Prometheus ecosystem, including Alertmanager
• Background in building rule engines or expression evaluation systems
• Experience with notification systems and integrations (PagerDuty, Slack, webhooks, etc.)
• Familiarity with observability tools like Grafana, the ELK stack, or similar solutions
• Experience with CI/CD tools such as Bitbucket, Jenkins, or CircleCI
• Understanding of alert fatigue mitigation strategies and intelligent alerting patterns
• Experience with high-cardinality data and performance optimization
• Willingness to speak your mind and share ideas
• Appreciation for humor and a love for goats
• Comfort working remotely

🏖️ Benefits

• Health, dental, vision, short-term disability, and life insurance
• Paid holidays and paid time off
• Fertility treatment benefit
• 401(k)
• Equity
• Eligibility for a discretionary company-wide bonus
