Lead Site Reliability Engineer

Vancouver, BC
We’re looking for a top-tier engineer to join the SRE team at the technical core of a rapidly growing organization. The SRE team is the beating heart of all projects, and your focus will be to drive development of tools and technologies that support a wide range of projects throughout their lives. You’ll draw on your experience to play a key role in improving our in-house systems, and you’ll work closely with Engineering as a whole to research and apply the latest and greatest technology to our process. Through your efforts you’ll help shape the way a revolutionary new technology is introduced to and adopted by new audiences. Every day, you’ll collaborate with a world-class team in our Vancouver office.

Every one of us shares a common vision: to create the future we want to live in. We need the right people to help us realize that vision.

A little about us:
Dapper Labs is the company behind CryptoKitties. Formed in February 2018, Dapper Labs was spun out of Axiom Zen to spread the benefits of decentralization through the power of play, fairness, and true ownership. Notable investors in Dapper Labs include Andreessen Horowitz, Union Square Ventures, Venrock, Google Ventures, Samsung, and the founders of DreamWorks, Reddit, Coinbase, Zynga, and AngelList, among others. CryptoKitties is the world’s most popular blockchain application outside of cryptocurrency exchanges.

Dapper team members are humble and curious entrepreneurs, builders, and tinkerers who share a passion to demystify blockchain technology and tap its potential to create change in the world. Our people are our greatest strength: our diverse crew flourishes in a distributed hierarchy where personal autonomy and professional growth are encouraged. We value our culture above all else: regardless of where you came from, what you studied, or who you used to work for, your role here will require both a high level of creativity and strategic thinking on complex issues. Everyone here is a founder, and no one fits in a box. We’re all driven by an insatiable thirst for learning and development, and that’s what brings us together.

What we'll accomplish together:
  • Developing effective systems for our projects to deploy onto, ensuring all projects are scalable, resilient, and reliable in support of growing products.
  • Designing new processes to improve our ability to ship fast while maintaining a high quality system that we can depend on.
  • Building new tools and automation to fill the gaps in our current systems, as well as building entirely new ones as we face bigger and more complex challenges.
  • Responding to incidents as they arise to restore operations, and supporting the larger Engineering team in its incident response.
  • Performing postmortems and in-depth root cause analysis to ensure we are always improving.

A little about you:
  • You have experience working with container orchestration systems like Kubernetes, Nomad, or Mesos.
  • You have experience deploying orchestration from scratch both on-prem and using cloud providers.
  • You have experience collecting and processing metrics with tools like Prometheus, Datadog, or New Relic.
  • You have experience building and working on deployment systems.
  • You have experience building pipelines to facilitate different deployment methods, such as blue/green and canary releases, based on project requirements.
  • You have experience negotiating contracts with cloud providers in order to deliver noteworthy cost savings to an organization.
  • You have experience working with project stakeholders to develop SLO and SLI targets and implementing relevant metrics to verify that those targets are met.
  • You have experience leading and organizing production incident response and blameless postmortem exercises.
  • You are comfortable with responding to production incidents and can fight fires with a calm and level head.
  • You have experience coding and developing applications. Bonus points for Go experience.
  • You have experience working with Infrastructure as Code systems like Terraform or CloudFormation.
  • You are comfortable diving into an unfamiliar system and finding your way around.
  • You have a strong ability to collaborate with cross-functional teams and build solid working relationships with everyone in the organization, from individual contributors to the CEO.
  • We believe in processes and the power of planning, but you will often have to roll with the punches and prioritize the most impactful tasks on the fly.