Data Reliability Engineer

🕒 May 21

Empower

10,000+ employees

💸 Finance

💳 Fintech

👥 B2C

Empower is a leading provider of financial services focused on helping individuals and organizations achieve financial freedom through retirement planning and investment management. Serving over 19 million Americans, Empower offers a comprehensive suite of finance-related services, including smart planning and investment advice, and tools like the Empower Personal Dashboard™ for a complete financial view. The company is renowned as a top retirement plan provider and works closely with personal investors, workplace plan savers, plan sponsors, and financial professionals. Empower is also recognized for its Diversity, Equity, and Inclusion initiatives and has a social commitment that bolsters community impact.

📋 Description

• Own the reliability and stability of production data pipelines and data platform services.
• Define, improve, and enforce data SLAs/SLOs for batch and streaming products, including freshness, latency, and completeness.
• Diagnose and resolve data pipeline failures, delays, and data quality issues in production environments.
• Investigate issues across distributed data systems, including Spark/EMR workloads, ingestion pipelines, and warehouse performance.
• Lead or support incident response, including triage, mitigation, and long-term resolution.
• Perform root cause analysis and implement durable fixes to prevent recurrence.
• Design and enhance monitoring, alerting, and observability for data systems.
• Develop automation and tooling to reduce operational toil and improve system resilience.
• Contribute to disaster recovery and resiliency planning, including backup validation and recovery workflows.
• Partner with engineering teams to improve pipeline design, reliability, and operational readiness.
• Create and maintain runbooks, Standard Operating Procedures, and operational documentation.
• Participate in occasional off-hours support for production data systems when required.

🎯 Requirements

• Bachelor’s degree in Computer Science, Information Systems, Data Science, or a related field
• 5+ years of experience in data engineering or analytics platform roles, including 3+ years operating in a production cloud data warehouse environment such as Redshift or Snowflake
• 3+ years of experience building AWS data pipelines and supporting them through production, including exposure to real-world failures and operational challenges
• 3+ years of experience working with production data platforms in AWS environments, with a focus on anomaly detection, reconciliation, and end-to-end validation
• 3+ years of experience with Python and SQL in real data systems
• Hands-on experience troubleshooting distributed data processing systems such as Spark/EMR, Redshift, and streaming systems
• Proven ability to debug and resolve production issues in data pipelines and data platforms
• Experience with AWS data services such as EMR, Redshift, DynamoDB, S3, or similar
• Proven ability to handle production incidents and perform root cause analysis
• Strong problem-solving mindset and ability to work through ambiguous production issues

🏖️ Benefits

• Medical, dental, vision and life insurance
• Retirement savings – 401(k) plan with generous company matching contributions (up to 6%), financial advisory services, potential company discretionary contribution, and a broad investment lineup
• Tuition reimbursement up to $5,250/year
• Business-casual environment that includes the option to wear jeans
• Generous paid time off upon hire – including a paid time off program plus ten paid company holidays and three floating holidays each calendar year
• Paid volunteer time – 16 hours per calendar year
• Leave of absence programs – including paid parental leave, paid short- and long-term disability, and Family and Medical Leave (FMLA)
• Business Resource Groups (BRGs) – BRGs facilitate inclusion and collaboration across our business internally and throughout the communities where we live, work and play. BRGs are open to all.
