Senior Manager, Site Reliability Engineer

August 30

Eclipse Foundation

Open Source • Software Development

Eclipse Foundation is an international non-profit association that serves as a community-driven, open source platform for innovation and collaboration. It is well known for being home to the Eclipse IDE, Jakarta EE, and a plethora of open source projects including development frameworks, tools, cloud, edge, automotive, and IoT technologies. Supported by industry leaders, the foundation promotes open source as a pivotal component of business strategies, offering a scalable and vendor-neutral environment for software development. The Eclipse Foundation facilitates collaborations among its members, which include individual developers and organizations, across a wide array of industries, fostering an ecosystem that values open source for business transformation and technological progression.

11 - 50 employees

Founded 2004

📋 Description

• Architect and manage Kubernetes deployments for Open VSX in production environments
• Oversee PostgreSQL and Elasticsearch clusters, ensuring data integrity, performance, and scalability
• Implement and refine monitoring, alerting, and incident response systems to maintain high service reliability
• Collaborate with development teams to improve CI/CD pipelines and deployment workflows
• Partner with the Security team to implement and uphold organisational policies and secure-by-design practices
• Lead root cause analysis and postmortems for service disruptions, driving continuous improvement
• Provide technical leadership and mentorship to junior operations staff
• Engage with the community and users to resolve support issues and gather feedback
• Maintain documentation and contribute to operational playbooks
• Define and report on service KPIs, SLOs, and operational health indicators
• Provide strategic advice to leadership on platform operations and technology decisions
• Contribute to annual planning cycles by informing resource needs, tooling requirements, and infrastructure budgeting

🎯 Requirements

• 5+ years of experience in site reliability engineering, DevOps, or IT operations
• Deep expertise in Kubernetes, Helm, and container orchestration
• Strong experience with PostgreSQL and Elasticsearch in production environments
• Proficiency in monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack)
• Solid scripting and automation skills (e.g., Bash, Python, Ansible)
• Familiarity with GitHub Actions or similar CI/CD tools
• Excellent troubleshooting skills and a proactive mindset
• Ability to work independently in a remote, multicultural team
• Excellent communication skills
• Bonus: experience supporting open source infrastructure or registries

🏖️ Benefits

• Friday flex-time
• Right-to-disconnect policy
• Corporate Recharge days
• Competitive compensation
• Comprehensive benefits package
