Senior Site Reliability Engineer, Workforce Identity

🔥 3 minutes ago

Coinbase

1001 - 5000 employees

Founded 2012

₿ Crypto

💸 Finance

💳 Fintech

💰 $21.4M Post-IPO Equity on 2022-11

Coinbase is a leading cryptocurrency exchange platform that allows individuals and institutions to buy, sell, and trade crypto assets such as Bitcoin and Ethereum. The company offers advanced trading tools, institutional solutions, and a self-hosted wallet for storing and managing cryptocurrencies. With a strong focus on security and transparency, Coinbase provides a trusted platform used by millions globally. It supports features including staking, earning rewards, and spending crypto through its cards. Coinbase also provides developer tools and APIs for building onchain applications, making it a comprehensive hub for engaging in the crypto economy.

📋 Description

• Own the reliability and availability of corporate IAM systems end-to-end, including on-call rotation, incident response, root cause analysis, and blameless retrospectives to ensure 24x7 uptime for critical identity services.
• Build and maintain CI/CD pipelines and automation tooling to streamline identity platform deployments, eliminate manual operational tasks, and progressively test changes before production rollout.
• Partner with IT, Security, and Engineering teams as a subject matter expert on corporate IAM and DevOps tooling, advising on identity lifecycle, provisioning, SSO, MFA, and access control architecture.
• Drive observability and disaster recovery maturity across IAM systems by defining metrics, implementing monitoring solutions, and designing recovery strategies that meet strict SLA requirements.
• Strengthen documentation standards by creating and maintaining comprehensive runbooks, system configuration guides, and troubleshooting procedures across the identity platform lifecycle.

🎯 Requirements

• 5+ years of experience in site reliability engineering or infrastructure engineering, with hands-on ownership of identity and access management platforms (Okta, Duo, Auth0, EntraID, Ping, or similar).
• Demonstrated success building tooling that solves IAM-specific problems including identity lifecycle and provisioning, SSO, MFA, and ABAC/RBAC at enterprise scale.
• Proficiency in at least one modern programming language (Go, Python, Ruby, Java, or C#) and experience designing and maintaining CI/CD pipelines using Git-based workflows and IaC tools (Terraform or equivalent).
• Track record managing cloud environments (AWS, GCP, or Azure) with infrastructure-as-code, including container orchestration and automated deployment frameworks.
• Utilizes generative AI responsibly, maintaining human oversight to deliver business-ready outputs and drive measurable improvements in workflow efficiency, cost, and quality.

🏖️ Benefits

• Benefits (medical, dental, vision, 401(k))

Similar Jobs

🔥 1 hour ago

Aya Healthcare

5001 - 10000

⚕️ Healthcare Insurance

🎯 Recruiter

Lead the SRE team at Aya Healthcare to enhance product reliability and operational efficiency. Manage incident response and AI-native operations for a top healthcare workforce solutions provider.

AWS

Azure

Google Cloud Platform

🔥 2 hours ago

Offchain Labs

11 - 50

₿ Crypto

🌐 Web 3

Site Reliability Engineer at Offchain leading a movement in blockchain scalability and security. Tackling real-world challenges and transforming interactions with decentralized applications.

AWS

Azure

Cloud

Google Cloud Platform

Linux

Python

Shell Scripting

Go

🔥 3 hours ago

BeyondTrust

1001 - 5000

🔒 Cybersecurity

Cloud Operations Engineer monitoring, maintaining, and responding to incidents for BeyondTrust Cloud Service. Collaborating across teams to ensure service health and handling cloud environments.

AWS

Azure

Cloud

Distributed Systems

Docker

JavaScript

Kubernetes

Linux

Python

Terraform

🔥 5 hours ago

MKS2 Technologies

201 - 500

🤝 B2B

🔒 Cybersecurity

Site Reliability Systems Engineer working with monitoring tools to enhance VA's infrastructure reliability. Collaborating across teams to resolve outages and improve service quality for veterans.

AWS

Azure

Cloud

Java

JavaScript

Linux

Oracle

ServiceNow

Splunk

Unix

🔥 5 hours ago

VAST Data

501 - 1000

DevOps Engineer developing tools to enhance efficiency for the Sales Engineering team at an AI infrastructure company. Responsible for managing AWS services and backend applications.

Angular

AWS

DNS

Docker

EC2

GraphQL

JavaScript

Linux

MongoDB

Node.js

SCSS

Shell Scripting

Unix