Manager, Reliability Engineering

🔥 0 minutes ago

Kohl's

10,000+ employees

Founded 1962

💄 Beauty

⚕️ Healthcare Insurance

🧘 Wellness

Kohl's is a skincare solution brand that emphasizes science in its product development. It focuses on creating effective skincare products that are formulated with proven ingredients and cutting-edge technology, like QuSome®, to enhance and rejuvenate the skin. Committed to addressing a variety of skin issues, Kohl's offers targeted treatments designed to achieve healthy and beautiful skin while ensuring high standards of effectiveness with no unnecessary extras.

📋 Description

• Lead and mentor a team of reliability engineers to drive operational excellence across Kohl’s distributed systems
• Develop and implement strategies, collaborate closely with engineering teams and ensure SRE best practices are embedded throughout the software development lifecycle
• Conduct design reviews, implement robust monitoring and alerting and establish auto-healing practices
• Provide leadership and guidance during critical incidents to triage, troubleshoot and resolve complex issues
• Drive comprehensive root cause analysis and follow-through on preventative measures
• Manage the software lifecycle, driving reliability, observability and efficiency in collaboration with peers across Design, Product Management, and Engineering
• Lead major automation and toil reduction initiatives, simplifying the ecosystem and reducing risks
• Set the vision and drive cultural transformation within the team
• Coach team through empathy and hands-on mentoring
• Develop and deliver training programs to upskill the team and broaden SRE adoption across the organization

🎯 Requirements

• Bachelor's Degree or equivalent in MIS, Computer Science or related field
• 6+ years of experience in software development and 2+ years of progressive leadership experience, mentoring diverse teams
• Advanced in-depth knowledge of application design patterns, event-driven architecture, database schemas and testing strategies
• Demonstrated knowledge of systems architecture, operating system internals and networking
• Proven experience with multi-region application troubleshooting and performance tuning
• Demonstrated experience working with at least one cloud platform (GCP, AWS, or Azure) and with hybrid cloud environments
• Advanced in-depth knowledge of and experience with continuous integration, continuous deployment and test-driven development
• Strong programming skills in one or more languages (Java, Python, Go or Node.js)

🏖️ Benefits

• Health insurance
• Professional development
• Flexible work arrangements

Similar Jobs

🔥 2 hours ago

Tern

11 - 50

💳 Fintech

🤝 B2B

💸 Finance

Site Reliability Engineer overseeing infrastructure migration from Heroku to GCP at Tern. Ensuring production reliability and operational excellence in a rapidly growing travel tech company.

🔥 8 hours ago

The Amatriot Group

201 - 500

🎯 Recruiter

🏛️ Government

🔒 Cybersecurity

DevSecOps Engineer designing and operating enterprise CI/CD pipelines at Amatriot Group. Integrating security/compliance controls and maintaining documentation for large-scale environments.

🔥 13 hours ago

Applied Research Solutions

501 - 1000

🏛️ Government

🔒 Cybersecurity

Senior DevOps Engineer responsible for cloud application administration and integration engineering. Collaborating with cross-functional teams to ensure seamless data flow and architecture.

🔥 16 hours ago

EXL

10,000+ employees

Forward Deployment Engineer responsible for deploying EXLdata.ai in client cloud environments. Collaborating with client teams to ensure successful deployment and adoption.

🇺🇸 United States – Remote

💵 $130k - $150k / year

💰 $2M Venture Round on 2015-01

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

🔥 18 hours ago

AuthZed

11 - 50

🔌 API

🔒 Cybersecurity

☁️ SaaS

Site Reliability Engineer responsible for maintaining systems reliability and performance at AuthZed. Collaborate globally while developing scalable infrastructure solutions for a cutting-edge authorization platform.