Site Reliability Engineer

September 26

ONE

Fintech

ONE is a financial technology company that focuses on providing innovative banking services through a mobile application. Although not a traditional bank, ONE offers various banking services in partnership with Coastal Community Bank. The company's offerings include cash back rewards, credit building tools, early direct deposit options, and high-yield savings accounts. ONE also features a digital wallet for shopping benefits, especially at Walmart. With an emphasis on user-friendly online banking, ONE aims to help customers take control of their credit, manage their finances, and earn rewards without the typical fees associated with traditional banks.

201 - 500 employees

📋 Description

• Ensure stability, scalability, and security of systems powering OnePay's financial products for millions of customers
• Design, build, and maintain scalable infrastructure and tooling to improve reliability, performance, and availability across the platform
• Contribute to the evolution of observability stack, platform libraries, cloud architecture, and CI/CD pipelines
• Develop automation and monitoring systems to detect, prevent, and remediate incidents before they impact customers
• Partner closely with product and platform engineering teams to embed reliability best practices in design, development, and deployment
• Lead root cause analysis and postmortems, driving long-term improvements in resiliency and fault tolerance

🎯 Requirements

• 5+ years of experience as a Software Engineer focused on building and running reliable, large-scale, distributed systems in production
• 5+ years of operational experience in observability tooling and libraries (metrics, logging, tracing); experience using Datadog or similar tools (Prometheus, Grafana)
• Proficiency in at least one programming language (Python, Go, Java, or Node.js preferred) for automation and tooling
• Proficiency in incident management, going on-call, and writing post-mortem reports
• Excellent collaboration skills with the ability to influence and educate product engineering teams on reliability and observability best practices
• Hands-on experience with cloud platforms (AWS preferred), container orchestration (Kubernetes), and IAC tools (Terraform, Pulumi)
• Drive and proactivity; builder and executor mindset
• Familiarity with functional programming concepts and fp-ts/TypeScript is a plus
• Authorization to work in the United States (application asks about work authorization and sponsorship)

🏖️ Benefits

• Competitive base salary
• Stock options
• Health benefits from Day 1
• 401(k) plan with company match
• Remote-friendly (US)
• Flexible time off (FTO)
• Opportunities for growth
• Inclusive, mission-driven culture
