Site Reliability Engineer


September 24


Sophos

Cybersecurity • SaaS

Sophos is a leading cybersecurity company that specializes in protecting businesses against advanced cyber threats. The company offers a comprehensive suite of security solutions, including endpoint protection, managed detection and response (MDR), network security, and cloud security. With a prevention-first approach, Sophos aims to stop ransomware and other cyber threats before they cause harm. Sophos provides services such as threat research, security training, and operational support to ensure robust defense against cyberattacks. Their solutions cater to various industries including finance, healthcare, government, manufacturing, and retail. The Sophos Central platform delivers centralized security management, integrating seamlessly with existing IT infrastructure to enhance security posture.

1001 - 5000 employees

Founded 1985

🔒 Cybersecurity

☁️ SaaS

💰 Post-IPO Equity on 2021-08

📋 Description

• Carry personal responsibility for high-quality changes to a live IT / SaaS infrastructure
• Manage key operational systems on which millions of customers depend
• Create estimates and work plans, then deliver according to those plans in an agile manner
• Participate in a weekday / weekend on-call rotation
• Occasionally carry out changes outside of normal working hours
• Monitor existing infrastructure, identify areas of weakness and propose improvements
• Take corrective action, as a first point of contact, to resolve incidents escalated from Technical Support, IT and development communities, and escalate where appropriate
• Provide incident response and coordination, ensuring issue resolution and helping coordinate multiple resources
• Publish software updates as required to meet product release schedules, monitoring load on infrastructure and support, taking corrective action where possible and flagging issues with management

🎯 Requirements

• Minimum of 2 years with AWS in Production: Hands-on experience operating live systems in Amazon Web Services (AWS), including EC2, ECS/EKS, S3, IAM (roles and policies), CloudWatch, SQS, SNS, Systems Manager (SSM), and Secrets Manager.
• Excellent Communication & Documentation: Collaborates and communicates effectively with all team members; creates, reviews, and maintains clear documentation.
• Linux Fluency: Command-line configuration and troubleshooting in cloud-based Linux environments.
• Incident Management and Change Control: Leads incident bridges; communicates status, impact, risk, and next actions clearly; confirms and documents approvals before executing changes; records actions and outcomes for auditability.
• On-Call Experience: Has participated in, or is open to participating in, structured on-call rotations, including weekday daytime/evening coverage and periodic weekends.
• Continuous Integration and Delivery (CI/CD): Familiarity with pipeline and release processes using tools such as GitHub Actions or Jenkins.
• Monitoring and Alerting: Uses dashboards and alerts to triage and resolve issues with tools such as CloudWatch, PagerDuty, Datadog, or Logz.io (or equivalent).
• Networking Fundamentals: Understanding of IP, DNS, routing, and load balancing across Local Area Networks (LAN) and Wide Area Networks (WAN).
• Ways of Working: Experience in ITIL and Agile environments and adherence to structured change control; collaborates effectively across global time zones; learns new technologies quickly.
• Scripting for Operations: Practical Bash and Python for automation, investigation, and remediation.
• Nice-to-Haves: Infrastructure as Code (Terraform or equivalent) and familiarity with Git-based workflow automation.
• Nice-to-Haves: Containers and Orchestration experience (ECS today and EKS/Kubernetes for migration).
• Nice-to-Haves: Databases exposure (MongoDB and Amazon RDS MySQL/PostgreSQL) for operational support and triage.
• Nice-to-Haves: Security awareness of malware, spam, and network-threat concepts.
• Nice-to-Haves: Compliance exposure (SOC 2, FedRAMP, or similar frameworks).
• Nice-to-Haves: Experience using generative AI tools such as GitHub Copilot or Cursor to assist with documentation and scripting.
• Legal authorization to work in the jurisdiction where the position is posted, without requiring employer sponsorship.

🏖️ Benefits

• Bonus eligibility
• Comprehensive benefits package
• Remote-first working model (remote work primary option)
• Employee-led diversity and inclusion networks
• Annual charity and fundraising initiatives and volunteer days
• Global employee sustainability initiatives to reduce our environmental footprint
• Global fitness and trivia competitions
• Global wellbeing days for employees to relax and recharge
• Monthly wellbeing webinars and training to support employee health and wellbeing


