Principal Site Reliability Engineer

Copper.co

201 - 500 employees

💸 Finance

💳 Fintech

₿ Crypto

Copper.co is a leader in digital asset infrastructure, providing institutional-grade custody, prime services, and collateral management for a wide range of clients. The company has developed cutting-edge technology to ensure the security and speed of cryptocurrency transactions, positioning itself as a dominant player in the space. Copper.co's innovative ClearLoop network allows trading on centralized exchanges without moving assets from its secure custody, reducing counterparty risk for institutional investors. The company serves hedge funds, trading firms, exchanges, and more, offering services such as staking, treasury management, and wallets-as-a-service. With its award-winning solutions, Copper.co is trusted by over 1,000 organizations worldwide to manage and safeguard digital assets.

📋 Description

• Shape SRE: Define how we think about reliability, observability, and operational excellence. Drive the adoption of SRE principles across the organization while building the systems and processes that make those principles measurable, such as SLIs, SLOs, and error budgets.
• Scale Through Automation: Champion architectural improvements that enhance both system reliability and deployment velocity. Provide consultation on system architecture, build reusable platforms and frameworks, plan capacity needs, and conduct production readiness reviews to ensure services launch and operate successfully.
• Drive Technical Excellence: Engage in and improve the lifecycle of microservices, from inception through deployment, operation, observability, and continuous refinement.
• Lead Through Influence: Partner with engineering and product leadership to embed reliability into our product development lifecycle. Conduct blameless postmortems and drive systemic improvements in incident management. Mentor engineers across the organization on SRE practices, helping teams take ownership of their service reliability.

🎯 Requirements

• Experience in designing, analysing, and troubleshooting distributed systems or microservices architectures.
• Established expertise in observability and incident management.
• Proven experience in driving organizational change.
• Excellent communication skills, with a systematic problem-solving approach.
• Experience working with production workloads in AWS.
• Experience working in financial services or similarly regulated environments.
• Interest in blockchain-based technologies and/or decentralised finance.
• Master's degree in Computer Science or Engineering.

🏖️ Benefits

• 35 days of paid time off per annum, inclusive of annual leave and public holidays. Employees also receive one additional day of annual leave for each year of service.
• Private health insurance.
