Staff Site Reliability Engineer

November 13

FloSports

FloSports is a global independent sports media company delivering live events, award-winning original content, always-on social experiences, and comprehensive sports data solutions to passionate and underserved sports communities interested in more than 25 different sports including motorsports, wrestling, grappling, hockey, cheer, track & field, along with a wide array of NCAA athletics.

201 - 500 employees

📋 Description

• Lead the technical architecture and execution of our landmark migration from a legacy GCP environment to a modern, scalable infrastructure on AWS EKS.
• Architect, design, and drive our core infrastructure, defining the patterns for Terraform and GitOps that the rest of the organization will follow.
• Champion and drive our SLO-driven culture, setting the strategy for how we define, measure, and implement SLOs for critical user journeys, guided by the four Golden Signals (Latency, Traffic, Errors, and Saturation).
• Lead the design and development of critical tooling and automation in Node.js and Go to solve entire classes of problems for our developers.
• Lead the architectural evolution of our in-house, K6-based load testing platform, ensuring it can scale to meet future product demands.
• Act as a primary subject matter expert for our Istio service mesh, driving its architecture, adoption, and optimization.
• Spearhead and own high-priority initiatives, including the development of agentic workflows and intelligent automation for SRE domains like proactive scaling and automated remediation.
• Act as a technical leader by participating in our blameless on-call rotation, mentoring other engineers through complex incidents and ensuring all post-mortems lead to systemic, long-term improvements.

🎯 Requirements

• Extensive Experience: 8-10+ years in SRE, DevOps, or Software Engineering, with a proven track record of operating at a Staff level.
• Proven Technical Leadership: You have a history of mentoring other senior engineers, influencing technical direction across multiple teams, and leading large-scale projects to completion.
• Expert Coder: You are a polyglot with deep expertise in languages like Node.js or Go and a history of building and maintaining critical automation and services.
• Kubernetes Architect: You have an expert-level, architectural understanding of Kubernetes (EKS preferred), including networking, custom controllers, and control plane optimization.
• Infrastructure as Code Expert: You are a Terraform expert who has designed and implemented large-scale, reusable, and secure IaC frameworks, not just consumed them.
• Observability Architect: You have designed and implemented observability strategies from the ground up, leveraging platforms like Datadog to create actionable SLOs and provide deep system insight.
• CI/CD Architect: You have designed, built, and scaled complex CI/CD systems (ideally with GitHub Actions and self-hosted runners) that are used by an entire engineering organization.
• Expert Systems Thinker: You can decompose highly ambiguous, complex, cross-functional problems into solvable parts and lead the technical solution from concept to production.

🏖️ Benefits

• Recognized three years in a row as a Top Workplace by the Austin American-Statesman
• Flexibility at work: you can take control of your professional and personal schedule
• All-hands events hosted in beautiful Austin, Texas
• Annual equity awards for all top performers
• Competitive and comprehensive medical, dental, and vision plans
• Peace of mind through company-paid short-term disability, long-term disability, and life insurance
• Generous 401(k) company match, vested immediately
• Progressive parental leave policies
• Flexible paid time off
• Hackathons and a full calendar of team-building and social events
• Company donation to youth teams and leagues that our employees coach
• Stocked snack bar, catered lunch, and breakfast tacos every week
