Sep 18, 2023

Senior Software Reliability Engineer - Observability (remote across Aus & NZ)

  • Canva
  • Sydney, New South Wales, Australia
Information Technology

Job Description

Join the team redefining how the world experiences design.

Hey, g'day, mabuhay, kia ora, 你好, hallo, vítejte!

Thanks for stopping by. We know job hunting can be a little time-consuming and you're probably keen to find out what's on offer, so we'll get straight to the point.

Where and how you can work

Our flagship campus is in Sydney. We also have a campus in Melbourne and co-working spaces in Brisbane, Perth and Adelaide. But you have a choice in where and how you work. That means if you want to do your thing in the office (if you're near one), at home or a bit of both, it's up to you.

What you’d be doing in this role

As Canva scales, change continues to be part of our DNA. But we like to think that's all part of the fun. So this will give you a flavour of the type of things you'll be working on when you start, but it will likely evolve.

At the moment, this role is focused on:
- Building and improving, at a regular cadence and to an exceptional standard of quality, the observability platform and tooling used by all Canva engineers.
- Providing technical leadership and expertise to drive pragmatic solutions and dive into impactful design decisions.
- Brainstorming, researching and prototyping to optimize our logging platform and improve its operational effectiveness and reliability.
- Being proactive in improving the user experience with logging and advocating for best practices.
- Participating in team ceremonies, knowledge sharing, brainstorming sessions etc.
- Becoming an observability champion, evangelising best practices and guiding other Canvanauts in the observability space.
- Finding solutions to make better use of logs and provide better insights to our engineers.

You're probably a match if:

  • You are proficient and happy to code in Python, Java or Golang
  • You have deep knowledge and understanding of Computer Engineering fundamentals and first principles
  • You are proficient with infrastructure-as-code; we’re a Terraform shop, but strong experience with other IaC tools will do the trick
  • You have solid knowledge of AWS (EC2, Lambda, SQS, Kinesis, S3) or equivalent
  • You have experience with observability tooling, with competency in tools like Elasticsearch, Kibana or similar
  • You have experience with highly reliable and available distributed systems, and with highly scalable databases

Not essential, but helpful experience

  • Have experience with OpenTelemetry because it’s going to underpin a lot of the tooling the team owns.
  • Have experience writing application code in Java since we also maintain the Java logging libraries.
  • Have experience building and running monitoring infrastructure at a large scale, for example petabyte-scale Elasticsearch clusters or similar databases.
  • Have experience with data handling at scale
  • Have experience with Kubernetes
  • Have experience with data security, data obfuscation and PII detection.

About the Reliability Platform Group:
The Reliability Platform Group is responsible for providing the tools and processes to scale reliability across all Canva services. Our teams work together, and with other groups, to deliver preventive and detective tooling, processes and best practices that uplift Canva’s reliability. We do this by driving operational excellence, reducing the impact of incidents, and providing visibility and accountability across the broader Engineering community.

About the team:
The Observability Logs and Events Team is part of the Observability sub-group and is responsible for the end-to-end experience for logs and events inside Canva. Our goal is to provide our development teams with world-class tools to view how their services are performing in production. We achieve this by combining industry-leading third-party solutions with our own in-house solutions. We work across the entire stack, maintaining our logging SDKs (Java, with Golang to come), our logging pipeline and our infrastructure. As we scale, all of these areas require more sophisticated solutions to ensure that Canva developers can continue to grow without compromising on reliability or availability.

What's in it for you?

Achieving our crazy big goals motivates us to work hard - and we do - but you'll experience lots of moments of magic, connectivity and fun woven throughout life at Canva, too. We also offer a stack of benefits to set you up for every success in and outside of work.

Here's a taste of what's on offer:
• Equity packages - we want our success to be yours too
• Inclusive parental leave policy that supports all parents & carers
• An annual Vibe & Thrive allowance to support your wellbeing, social connection, office setup & more
• Flexible leave options that empower you to be a force for good, take time to recharge and support you personally

Check out for more info.

Other stuff to know

We make hiring decisions based on your experience, skills and passion, as well as how you can enhance Canva and our culture. When you apply, please tell us the pronouns you use and any reasonable adjustments you may need during the interview process.

Please note that interviews are conducted virtually.