It can be difficult to evaluate web ops candidates, for a couple of reasons. One is that the field demands a wide breadth of knowledge, so spending too long on any one technical area can waste interview time. Another is that it can be difficult to gauge in an interview how collaborative someone is. Collaboration is a requirement at Etsy. 🙂
So in addition to the standard technical questions, I like to ask high-level questions whose answers can zoom in and out of the larger operations picture:
- Diagram the current architecture you’re responsible for, and point out where it’s not scalable or fault-tolerant.
- What are some examples of how you might scale a read-heavy application? Why?
- What are some examples of how you might scale a write-heavy application? Why?
- Tell me how code gets deployed in your current gig, from developer’s brain to production.
- Tell the story of the best-run outage you’ve been a part of, in as much detail as you can. What made it “good”?
- Tell the story of the worst-run outage you’ve been a part of, in as much detail as you can. What made it “bad”?
- What is the purpose of a post-mortem meeting?
- How do you handle (and feel about) making changes (code/schema/network/etc.) in your current environment?
These are purposefully open-ended questions meant to dig into what’s important to you as someone responsible for the performance and availability of a growing website. This is just a snippet of what we normally ask, in addition to my (and Jesse’s) favorite interview question.
So: maybe you should take a look at the type of ops engineers we’re looking for, and apply? 🙂