Senior Site Reliability Engineer, CCIP

🔥 1 hour ago

Chainlink Labs

201 - 500 employees

Founded 2017

💸 Finance

💳 Fintech

🌐 Web 3

Chainlink Labs is a leading player in the field of decentralized finance (DeFi) and blockchain technology. The company is pioneering the use of decentralized systems to facilitate onchain transactions for financial institutions and marketplaces. By collaborating with financial market infrastructures, asset managers, and top DeFi protocols, Chainlink Labs is driving the transition to a tokenized asset economy and aims to become the global standard for onchain finance. With expertise in cryptography and a robust track record in security, Chainlink Labs provides a platform that powers a global system of onchain finance.

📋 Description

• Ensure the reliability, scalability, and operational excellence of the systems powering Chainlink's Cross-Chain Interoperability Protocol (CCIP).
• Influence reliability practices across the platform and help establish operational standards that scale with the business.
• Improve deployment safety and increase delivery velocity by advancing production engineering practices.
• Establish distributed tracing across the platform to improve observability and accelerate incident investigation.
• Eliminate operational toil through automation that increases engineering efficiency and platform reliability.
• Drive adoption of meaningful SLOs, SLIs, and error budgets that guide engineering decisions and improve service health.
• Increase platform scalability and operational readiness as CCIP continues to grow.

🎯 Requirements

• Demonstrated experience in Site Reliability Engineering, Production Engineering, or a similar role operating large-scale distributed systems.
• Deep expertise in defining, implementing, and driving adoption of SLOs, SLIs, and error budgets across engineering organizations.
• Experience building and operating production Kubernetes environments that support critical services.
• Experience applying OpenTelemetry to improve observability across distributed systems.
• Experience improving the reliability, scalability, and operability of production infrastructure.

🏖️ Benefits

• Long-term incentives
• Comprehensive benefits
