Staff Site Reliability Engineer

eCommerce • Logistics • Software

Stord provides cloud supply chain solutions for direct-to-consumer (DTC) and omnichannel brands, specializing in fulfillment, last-mile delivery, and inventory management. By integrating best-in-class software with physical logistics operations, Stord aims to improve the consumer experience from pre-purchase to post-delivery. The company supports over $5 billion in commerce annually and offers services such as order management, shipment protection, and inventory planning to streamline supply chains and improve brand loyalty. Stord serves a wide range of industries, including health & beauty, nutrition & supplements, and apparel & accessories.

501 - 1000 employees

Founded 2019

🛍️ eCommerce

October 11

🇺🇸 United States – Remote

⏰ Full Time

🔴 Lead

⛑ DevOps & Site Reliability Engineer (SRE)

🦅 H1B Visa Sponsor

Cloud

Distributed Systems

Docker

Google Cloud Platform

Grafana

Java

Jenkins

Kubernetes

Prometheus

Python

Terraform

📋 Description

• Lead architecture decisions to deliver scalable and reliable infrastructure, primarily on Google Cloud Platform (GCP)
• Implement Infrastructure as Code (IaC) using Terraform, CloudFormation, Pulumi, or similar
• Manage containerized environments with Docker and Kubernetes
• Drive system performance tuning, capacity planning, and resource optimization
• Define and maintain Service Level Objectives (SLOs) and Indicators (SLIs)
• Build robust monitoring, alerting, and observability solutions using Prometheus, Grafana, DataDog, or New Relic
• Develop and maintain disaster recovery and business continuity strategies
• Design and maintain CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions, etc.)
• Automate operational workflows and infrastructure provisioning
• Provide escalation support for production incidents and lead post-incident reviews
• Conduct technical design reviews and offer architectural guidance
• Mentor junior engineers on SRE and infrastructure best practices

🎯 Requirements

• 8+ years of experience in site reliability, platform engineering, or infrastructure roles with leadership exposure
• Proficiency in at least one programming language (Python, Go, Java, etc.)
• Strong hands-on experience with GCP and its core services
• Expertise in containerization (Docker) and orchestration (Kubernetes)
• Deep knowledge of Infrastructure as Code (Terraform, CloudFormation, etc.)
• Skilled in monitoring/observability (Prometheus, Grafana, ELK, etc.)
• Solid understanding of networking, load balancing, and distributed systems
• Experience with Git and collaborative development workflows

🏖️ Benefits

• Flexible work arrangements
• Professional development
• Equipment allowances
