
Non-profit • Education • Media
Wikimedia Foundation is a nonprofit charitable organization dedicated to the growth, development, and distribution of free, multilingual content. It provides the essential infrastructure for free knowledge, including hosting Wikipedia, the free online encyclopedia that is created, edited, and verified by a global community of volunteers. Supported primarily through donations, Wikimedia Foundation promotes collaborative projects that aim to share knowledge reflecting human diversity and strives to protect everyone's right to access free and open knowledge.
July 10
🇺🇸 United States – Remote
💵 US$132.4k - US$208.4k / year
⏰ Full Time
🟡 Mid-level
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor

•Managing one to two globally distributed teams within Wikimedia’s Site Reliability Engineering organization
•Providing guidance, mentorship, and support to ensure the team's effectiveness and growth
•Working with team members to set individual performance goals, and supporting them in meeting and evolving their goals and career path
•Recruiting, hiring, and helping onboard new team members
•Triaging incoming workload, maintaining focus on priorities, and setting realistic expectations for both peers and team members
•Coordinating and communicating with other members of the Wikimedia product & engineering teams on relevant projects, executing complex projects and contributing to the organizational strategy
•Continuously developing the roadmap of the team in alignment with other SRE and Product & Technology teams, and helping to draft and execute the team’s annual and quarterly plans
•Project managing new and existing initiatives
•Leading the definition, refinement, and execution of the processes through which the team manages and performs work
•Leading incident response, diagnosis, and follow-up on system alerts and outages across Wikimedia’s production infrastructure
•Participating in the 24/7 on-call rotation to handle escalations and provide support for teams to resolve issues
•Facilitating the definition and establishment of Service Level Indicators and Objectives with service owners and stakeholders
•Prior experience managing teams
•Prior hands-on experience with software or reliability engineering (within the last 3 years preferred)
•Ability to analyze complex systems, troubleshoot issues, and devise effective solutions under pressure
•Proficiency in project management methodologies to effectively plan, execute, and track new and existing initiatives
•Strong understanding of cloud computing, networking, Linux systems administration, containerization (e.g., Docker, Kubernetes), and infrastructure as code (e.g., Terraform, Ansible) to be able to provide technical support to the team
•Aptitude for automation and streamlining of tasks
•Ability to communicate effectively in both spoken and written English
•Ability to work independently, as an effective part of a globally distributed team
•Ability to travel several times a year for occasional in-person meetings
•B.S. or M.S. in Computer Science or the equivalent in related work experience
•U.S. Benefits & Perks
July 9
51 - 200
Join Tekmetric as a Site Reliability Engineer to manage reliable cloud infrastructure and enhance system performance.
🇺🇸 United States – Remote
💰 Venture Round on 2022-03
⏰ Full Time
🟡 Mid-level
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
AWS
Cloud
Docker
Google Cloud Platform
Grafana
Java
JavaScript
Kubernetes
Prometheus
Python
Terraform
Go
July 8
Join Intermedia as a DevOps Engineer to deploy and maintain application infrastructure and collaborate with development teams.
🇺🇸 United States – Remote
💰 Venture Round on 2017-02
⏰ Full Time
🟡 Mid-level
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
Ansible
AWS
Cloud
Docker
ElasticSearch
ETL
Jenkins
Kubernetes
Linux
MySQL
Python
RabbitMQ
Redis
Go
July 6
Senior Site Reliability Engineer managing GCP infrastructure and DevOps practices. Help reduce wildfire risks using advanced technology.
Cloud
Google Cloud Platform
Kubernetes
Unix
July 4
Join Resonance as a DevOps Engineer to build and maintain an AI-driven platform for fashion.
🇺🇸 United States – Remote
💰 Venture Round on 2020-10
⏰ Full Time
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor
Airflow
AWS
Cloud
Distributed Systems
Docker
EC2
Grafana
GraphQL
Jenkins
Kafka
Kubernetes
Microservices
NoSQL
Prometheus
Terraform
July 3
Join MetaRouter as a Senior Site Reliability Engineer to enhance critical infrastructure operations. Experience with cloud environments and SRE practices required.
🇺🇸 United States – Remote
💵 $140k - $180k / year
⏰ Full Time
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
Cloud
Docker
Google Cloud Platform
JavaScript
Kubernetes
Node.js
Prometheus
React
Terraform
Go