
51 - 200 employees
A multiplayer computer for creating and sharing software. Get AI to write code for you with our new Generate Code feature.
🕒 May 20

• Join our Site Reliability Engineering (SRE) team to ensure reliability, scalability, and performance of Replit's infrastructure.
• Bridge the gap between development and operations, implementing automation and best practices.
• Proactively find and analyze reliability problems.
• Design robust observability solutions, lead incident response, automate operational tasks.
• Educate and mentor the broader engineering team to make reliability a core value at Replit.

• 8-10 years of experience in Site Reliability Engineering or similar roles (e.g., DevOps, Systems Engineering, Infrastructure Engineering).
• Strong programming skills in languages like Python or Go.
• Deep understanding of distributed systems.
• Deep experience with container orchestration platforms, specifically Kubernetes, and cloud-native technologies.
• Proven track record of designing, implementing, and maintaining sophisticated monitoring and observability solutions.
• Strong incident management skills with extensive experience leading incident response for complex systems.
• Experience with infrastructure as code (e.g., Terraform, Pulumi) and configuration management tools.
• Excellent written and verbal communication skills.
• Strong interpersonal skills, with experience working with and mentoring engineers from junior to principal levels.
• A willingness to dive into understanding, debugging, and improving any layer of the stack.

• Competitive Salary & Equity
• 401(k) Program with a 4% match
• Health, Dental, Vision and Life Insurance
• Short Term and Long Term Disability
• Paid Parental, Medical, Caregiver Leave
• Commuter Benefits
• Monthly Wellness Stipend
• Autonomous Work Environment
• In Office Set-Up Reimbursement
• Flexible Time Off (FTO) + Holidays
• Quarterly Team Gatherings
• In Office Amenities
🕒 April 21
DevOps engineer specializing in Ruby on Rails for a fully remote taxi ordering service. Design and operate scalable systems, ensuring high availability and performance with a collaborative team.
🇪🇺 Europe – Remote
⏰ Full Time
🟠 Senior
🔴 Lead
⛑ DevOps & Site Reliability Engineer (SRE)
🕒 February 17
Infrastructure/DevOps Engineer responsible for managing AWS and Kubernetes at Thrill Labs. Working on high-scalability projects and improving security measures in a fast-growing tech startup.
🇪🇺 Europe – Remote
⏰ Full Time
🟠 Senior
🔴 Lead
⛑ DevOps & Site Reliability Engineer (SRE)