Senior Site Reliability Engineer


22 hours ago


Offensive Security

Cybersecurity ‱ Education

Offensive Security is a leading provider of cybersecurity training and professional education. With a focus on continuous workforce development, they offer hands-on, real-world training through live-fire simulations and cyber ranges. Their offerings include industry-leading certifications and learning paths designed to develop critical cybersecurity skills for the most in-demand job roles. Offensive Security is recognized for its rigorous training content and labs and provides unique training solutions for organizations across various sectors, including public sector agencies. Their learning platform emphasizes skill development to build cyber workforce resilience and prepare individuals and teams for real-world cyber challenges.

201 - 500 employees

Founded 2006

🔒 Cybersecurity

📚 Education

💰 Private Equity Round on 2018-09

📋 Description

‱ Design and architect complex global data centers for labs supporting vulnerable machines and realistic attack scenarios using OpenStack
‱ Develop scalable infrastructure solutions across hybrid cloud and on-premises environments
‱ Design secure hosting networks and network topologies that can support realistic offensive cyber activities
‱ Establish infrastructure standards, patterns, and best practices for lab environment deployment
‱ Create architectural solutions that reduce infrastructure costs while improving capabilities and performance
‱ Implement network isolation for thousands of concurrent user lab instances
‱ Optimize lab deployment speed and resource utilization for peak performance
‱ Create infrastructure supporting the deployment of concurrent vulnerable machine instances at scale
‱ Design workspace-based deployment models enabling team collaboration and private lab sessions
‱ Partner closely with Lead Platform and Content Engineers to proactively identify and solve infrastructure requirements
‱ Provide strategic technical guidance and mentorship to development and operations teams
‱ Lead architectural reviews and challenge requirements to propose optimal technical solutions
‱ Drive adoption of infrastructure-as-code and automated deployment practices
‱ Identify process improvements and optimization opportunities before being asked
‱ Develop infrastructure automation using known Infrastructure as Code frameworks
‱ Create self-service capabilities for Content Engineers to deploy and manage lab resources efficiently
‱ Implement comprehensive monitoring, logging, and observability solutions for lab environments
‱ Establish disaster recovery and business continuity procedures with minimal downtime requirements
‱ Automate repetitive tasks to reduce toil
‱ Optimize application and infrastructure performance through automation and tuning
‱ Write runbooks to automate repetitive tasks using Ansible and Terraform
‱ Serve as a knowledge resource for the rest of the team on Ansible and Terraform
‱ Evaluate new and emerging products and technologies, and make recommendations on their introduction
‱ Conduct ongoing research into relevant technology stacks and architectural patterns, assessing their potential impact and value for internal use
‱ Assist in monitoring performance to address errors and bottlenecks
‱ Respond to and resolve infrastructure incidents and outages
‱ Participate in on-call rotations to ensure service reliability
‱ Design complex network architectures including VPNs, VLANs, and software-defined networking
‱ Implement network segmentation and security controls appropriate for vulnerable lab environments
‱ Configure and manage load balancers, firewalls, and network security appliances
‱ Design network monitoring and traffic analysis capabilities
‱ Ensure proper isolation between student lab environments while maintaining performance

🎯 Requirements

‱ OpenStack: Production experience with OpenStack deployment, management, and optimization
‱ Cloud Platforms: 5+ years of hands-on experience with AWS, Azure, and Google Cloud Platform
‱ Virtualization: Expert-level knowledge of OpenStack
‱ Networking: Deep understanding of TCP/IP, routing protocols, VPNs, firewalls, and network security
‱ Infrastructure as Code: Proficiency with a framework such as Terraform, CloudFormation, or ARM templates, and with configuration management tools
‱ Containerization: Experience with Docker, Kubernetes, or other container orchestration
‱ Operating Systems: Advanced knowledge of Linux and Windows Server
‱ 4+ years of experience in Site Reliability Engineering (SRE) or Infrastructure Architecture roles
‱ 2+ years in a senior or lead technical role with architectural responsibilities
‱ Proven track record of designing and implementing large-scale, distributed systems
‱ Demonstrated experience with infrastructure cost optimization and migration projects
‱ Experience with high-availability and disaster recovery implementations
‱ Background in cybersecurity, penetration testing, or vulnerability research environments (preferred, not required)

đŸ–ïž Benefits

‱ Flexible work arrangements
‱ Professional development opportunities


