
1001 - 5000 employees
Founded 2004
Yelp is a platform that connects consumers with local businesses, allowing users to discover and review a wide variety of services including restaurants, home services, and automotive services. It aims to help consumers find trusted recommendations for goods, services, and experiences in their local area, while offering business owners tools to manage customer interactions and promote their offerings.
🔥 20 hours ago
🇨🇦 Canada – Remote
💵 $135k - $185k / year
⏰ Full Time
🟡 Mid-level
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
Ansible
AWS
Chef
Cloud
Distributed Systems
DNS
Docker
Google Cloud Platform
Grafana
Java
Jenkins
Kubernetes
Linux
Open Source
Prometheus
Puppet
Python
Ruby
Rust
Splunk
TCP/IP
Terraform
TypeScript
Go

• Build and manage scalable, self-healing, globally distributed systems.
• Keep Yelp fast, available, and growing.
• Implement key parts of the core architecture and support developers.
• Empower Yelp: spinning up infrastructure should always be a git commit and a code review away, with automation and self-service at the core of what we do.
• Troubleshoot site issues using industry-leading tools like Splunk, Grafana, and Prometheus.
• Automate everything with Python, Puppet, Git, Jenkins, Terraform, and more!
• Develop custom tools when off-the-shelf solutions don’t work at our scale, and contribute upstream to open source projects.
• Design and implement new systems, tests, and procedures.
• Participate in light on-call rotations; we have geographically distributed SRE teams for follow-the-sun support.
• Mastery of Linux (we use Ubuntu, but any distro is fine).
• Command of your favorite modern programming language (Python, TypeScript, Ruby, Go, Rust, Java, C++, etc.) and an appreciation for delivering safe and secure services.
• A solid understanding of the fundamental technologies for delivering services on the Internet (TCP/IP, HTTP, DNS, etc.).
• Experience with public cloud platforms (we use AWS and GCP, but others are also fine) and related tooling (Terraform, Puppet, Chef, Ansible, etc.).
• Experience with Linux containerisation and orchestration (e.g., Docker, Podman, and Kubernetes).
• Self-motivated to investigate, fix, and improve Yelp in an ever-changing environment.
• Leading, collaborating, and sharing technical activities with global teams.
• Own the total lifecycle of a system.
• Health insurance
• Retirement plans
• Paid time off
• Flexible work arrangements
• Professional development
🕒 4 days ago
Agentic AI Forward Deployment Engineering Lead at Netomi transforming enterprise customer requirements into production-grade AI solutions. Collaborating with teams to ensure successful deployments and measurable business outcomes.
🇨🇦 Canada – Remote
💰 $30M Series B in 2021-11
⏰ Full Time
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
Distributed Systems
🕒 5 days ago
Site Reliability Engineer enhancing incident response and engineering practices for Vista's reliability. Focused on identifying failure patterns and implementing proactive improvements for operational excellence.
🇨🇦 Canada – Remote
💵 $104k - $143k / year
💰 $40M Venture Round in 2003-07
⏰ Full Time
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
AWS
Azure
Cloud
Grafana
Java
Python
TypeScript
Go
🕒 6 days ago
SRE / Network Engineer focused on Metal-as-a-Service and bare-metal automation for innovative cloud infrastructure. Supporting core infrastructure systems and scalable networks in a remote environment.
Ansible
Grafana
Linux
OpenStack
Prometheus
Python
VMware
🕒 June 17
SRE / Network Engineer working remotely for a European deep-tech cloud computing company. Responsible for maintaining infrastructure systems and automating processes across distributed environments.
Ansible
Cloud
Grafana
Linux
OpenStack
Prometheus
Python
VMware
🕒 June 16
DevOps Engineer at Intrahealth working on Kubernetes and CI/CD for healthcare data solutions. Focused on AI-augmented development and collaboration with global teams.
🇨🇦 Canada – Remote
💵 $130k - $150k / year
⏰ Full Time
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
AWS
Azure
Cloud
DNS
Flux
Google Cloud Platform
Grafana
Kubernetes
Prometheus
Python
Terraform
Go