Principal Production Engineer

🔥 0 minutes ago

Canva

1001 - 5000 employees

Founded 2013

☁️ SaaS

📱 Media

📚 Education

💰 $200M Venture Round on 2021-09

Canva is a versatile online design platform that empowers users to create a wide range of professional designs with ease. From social media posts and presentations to business cards and posters, Canva provides thousands of templates and design tools to help users bring their creative ideas to life. The platform also offers a suite of AI-powered features to enhance creativity and productivity, including tools like Magic Write for copy generation and Magic Edit for photo transformations. Canva caters to individuals, teams, and enterprises, making it an ideal solution for collaborative design and workflow management. It is also committed to sustainability and social impact, offering free educational and nonprofit access to its premium features.

📋 Description

• Join the team redefining how the world experiences design.
• The Production Engineering team sits at the intersection of software engineering and the hardest reliability problems in Canva's infrastructure.
• Writing software. Changing how production behaves. When it works, every team ships with more confidence and Canva gets faster and more resilient for the people who use it every day.
• The strategic bet is a different model entirely: Canva's own take on what production reliability looks like, built for how we work.
• Not operationalising systems. Not running alerts. Writing software that changes how production behaves.
• Leading the hardest engagements: Taking personal ownership of the most technically complex areas (sharding, multi-region architecture, JVM performance at scale) while the team builds depth in adjacent domains.
• Setting the technical bar: What it means to be a production engineer at Canva. The standard for technical credibility. The archetype that future hiring calibrates against.
• Pairing strategy across the team: Deciding how staff and mid-level engineers are paired and what they should be learning from each engagement.
• Building the measurement story: Incident severity and duration trending down. Feature launches going to production cleanly. You define what the metrics are and how they're tracked.

🎯 Requirements

Experience

• Production at scale: You've owned reliability in large-scale distributed systems. When things broke, you investigated why and shipped a fix that lasted.
• Technical leadership in embedded models: You've led or helped shape a function where engineers work across team boundaries rather than within a single one. You know what makes that model work and what makes it fail.
• Hands-on through seniority: You've stayed close to the code. At this level, you're the engineer others consult when the problem is genuinely hard.
• Cross-org influence: You've shaped how teams outside your own make technical decisions because your technical judgement is trusted.
• JVM or systems depth: You've built real things in Java, Go, Rust, C++, or a comparable systems language at production scale. Language matters less than depth.
• Distributed systems in practice: You've navigated sharding, replication, failure modes, and consistency tradeoffs in production.

Technical knowledge

• Linux internals: You can reason about process scheduling, memory, I/O, and the network stack when a system misbehaves.
• Distributed systems: Beyond sharding, replication, failure modes, and consistency tradeoffs, you're comfortable with consistent hashing, leader election, consensus, backpressure, and circuit breakers.
• Observability tooling: You've built the tracing, dashboards, and alerting that tell you what's wrong.
• Containerisation and orchestration: Kubernetes in production, at the scheduler level.
• Performance analysis: You've profiled JVM applications or systems-level processes and fixed what you found.
• Cloud infrastructure: AWS in production, across the failure modes that matter at scale.
• Incident response: You've been on-call and have opinions about what good looks like.

Nice to have

• Enterprise SaaS background: You've done this specific kind of work at an org that's done it well. You know what 'production engineering' means when it's not just a job title.
• JVM internals: You've tuned GC and profiled threads in production.
• Multi-region or sharding experience: You've been involved in a data store migration or multi-region architecture where getting it wrong was not an option.

🏖️ Benefits

• Equity packages: we want our success to be yours too
• Inclusive parental leave policy that supports all parents & carers
• An annual Vibe & Thrive allowance to support your wellbeing, social connection, office setup & more
• Flexible leave options that empower you to be a force for good, take time to recharge, and support you personally
