Senior Site Reliability Engineer

đŸ”„ 0 minutes ago

Akamai Technologies

5001 - 10000 employees

🔒 Cybersecurity

💰 Post-IPO Equity on 2001-07

Cloud Computing ‱ Cybersecurity ‱ Content Delivery

Akamai Technologies is a leading cloud services provider that specializes in delivering security, cloud computing, and content delivery solutions. It offers a range of services such as API security, DDoS protection, and performance optimization for web applications, ensuring secure and reliable user experiences. With a robust global infrastructure, Akamai empowers businesses to streamline their digital presence while safeguarding against various cyber threats and enhancing application performance.

📋 Description

‱ Owning the SRE infrastructure lifecycle from design reviews and pre-rollout readiness assessments through production sign-off and ongoing reliability management ‱ Designing and implementing frameworks that reflect customer experience for load balancing services and driving action when error budgets are at risk ‱ Building and maintaining observability pipelines from load-balancing components and system-level sources to dashboards that enable rapid incident triage ‱ Leading technical incident response for complex NB/NLB failures, acting as the technical commander and driving root cause analysis and preventive follow-through ‱ Developing and automating safe deployment workflows for phased releases, including bake-period monitoring, feature flag management, and validation across global datacenter rollouts ‱ Reviewing design documents, product-requirement documents and producing actionable SRE input on operational risks, capacity implications, Day-2 concerns, and product strategy gaps ‱ Building automation and tooling using Python or Go that reduces operational toil and improves team-wide operational capability

🎯 Requirements

‱ 8+ years of experience in SRE, infrastructure engineering, or platform engineering, working with large-scale distributed systems ‱ Demonstrate deep expertise with Linux networking fundamentals and diagnosing at the packet level using tcpdump, netstat, and similar tools ‱ Have hands-on experience with L4/L7 load balancing technologies covering configuration, health checking, high availability, and failure modes at scale ‱ Show a track record of defining SLO/SLI frameworks, building observability platforms from scratch, and running incident management processes at scale ‱ Demonstrate expertise in Kubernetes and containerization at scale including workload scheduling, networking, resource management, and operating stateful or network-intensive workloads in a cluster environment ‱ Build automation and tooling using Python or Go, with infrastructure-as-code experience (SaltStack, Ansible, or Terraform) and deployment safety instincts.

đŸ–ïž Benefits

‱ Healthcare
‱ RRSP
‱ Company holidays
‱ Vacation (in the form of PTO)
‱ Sick time
‱ Family-friendly benefits, including an employee assistance program with a focus on mental and financial wellness

