Site Reliability Engineer II

🔥 4 minutes ago

Zigsaw

11 - 50 employees

Founded 2016

Making job search and talent discovery simpler, faster, and more effective for job seekers.

📋 Description

• Ensuring the reliability, availability, and performance of production infrastructure and platform services
• Operating and scaling Kubernetes platforms, including governance and support for multi-tenant workloads
• Managing GitOps-based deployment workflows using ArgoCD and Helm
• Supporting infrastructure provisioning and change management through Terraform/Terragrunt
• Building and supporting CI/CD automation and deployment workflows using GitHub Actions
• Participating in incident response, root cause analysis, and post-incident improvement initiatives
• Reducing operational toil through scripting, tooling, and process automation
• Advancing observability practices across logs, metrics, traces, dashboards, and alerting
• Supporting secure secrets integration, IAM-aware operations, and platform guardrails
• Partnering closely with application, security, and platform teams to improve reliability and delivery outcomes

🎯 Requirements

• 4+ years of experience in Site Reliability Engineering, DevOps, Platform Engineering, or Cloud Infrastructure
• Strong hands-on experience operating AWS in production environments
• Good expertise in Kubernetes, including cluster operations, troubleshooting, workload reliability, and platform administration
• Experience with Kubernetes multi-tenancy, including namespaces, RBAC, quotas, policies, and tenant isolation patterns
• Experience implementing and operating ArgoCD within a GitOps delivery model
• Strong hands-on experience with Helm
• Experience with Terraform/Terragrunt for infrastructure provisioning and environment management
• Solid scripting and automation skills using Bash and/or Python
• Experience building, maintaining, or supporting CI/CD pipelines, ideally using GitHub Actions
• Strong troubleshooting skills across Linux, containers, IAM, networking, and distributed systems
• Experience with monitoring, alerting, and observability in production environments
• Demonstrated ownership mindset with experience handling incidents and resolving production issues
• Strong collaboration and communication skills, with the ability to work effectively across engineering, security, and platform teams
• Bachelor’s degree in computer science, engineering, a related field or equivalent experience
• Demonstrated ability to use AI to improve speed and quality in your day-to-day workflow for relevant outputs
• Strong track record of critical evaluation and verification of AI-assisted work (e.g., testing, source-checking, data validation, peer review)
• High integrity and ownership: you protect sensitive data, avoid over-reliance on AI, and remain accountable for final decisions and deliverables.

🏖️ Benefits

• Information regarding the culture at Pinterest and benefits available for this position can be found [here](https://www.pinterestcareers.com/pinterest-life/)

Similar Jobs

🔥 22 minutes ago

YipitData

201 - 500

💸 Finance

🏢 Enterprise

DevSecOps Lead managing secure software development lifecycle at YipitData. Collaborating across departments to strengthen security practices within engineering operations.

Cloud

Jenkins

SDLC

🔥 55 minutes ago

YipitData

201 - 500

💸 Finance

🏢 Enterprise

DevSecOps Lead building secure software development lifecycle and vulnerability management at YipitData. Leading cross-functional collaboration to implement security standards across software development.

Cloud

Jenkins

SDLC

🔥 2 hours ago

Aledade, Inc.

501 - 1000

⚕️ Healthcare Insurance

🏢 Enterprise

Salesforce DevOps Analyst ensuring quality and reliability of Salesforce solutions. Collaborating with teams for test strategies, automation, and CI/CD management.

Cloud

Cyber Security

Jenkins

Selenium

🔥 12 hours ago

Guidehouse

10,000+ employees

Site Reliability Engineer collaborating with teams to establish SRE practices and participate in system design reviews at Guidehouse. Focused on AWS cloud infrastructure and promoting automation.

Ansible

AWS

Azure

Cloud

Linux

Packer

Python

SDLC

Terraform

🔥 18 hours ago

EverCommerce

1001 - 5000

☁️ SaaS

🤝 B2B

🛍️ eCommerce

Lead DevOps Engineer at EverCommerce modernizing cloud infrastructure and deployment pipelines. Collaborating with teams for a seamless developer experience and best practices in security and compliance.

Ansible

AWS

Cloud

Terraform