(Courtney Nash’s excellent post on this topic inadvertently pushed me to finally finish this – give it a read) In the last post on this topic, I hoped to lay the foundation for what a mature role for automation might look like in web operations, and bring considerations to the decision-making process involved with considering...
The other day I posted about the intersections of Systems Safety and web operations and engineering. One of the largest proponents of bringing a systems thinking perspective to safety (specifically ‘software safety’) is Dr. Nancy Leveson, who has been in that field (really a multidisciplinary field) for at least a couple of decades. She’s the...
Anyone who has known me well knows that I’m generally not satisfied with skimming the surface of a topic that I feel excited about. So to them, it wouldn’t be a surprise that I’m now working on (yes, while I’m still at Etsy!) a master’s degree. Since January, I’ve been working with an incredible group...
Something that has struck me funny recently surrounds the traditional notion of availability of web applications. With respect to its relationship to revenue, to infrastructure and application behavior, and fault protection and tolerance, I’m thinking it may be time to get a broader upgrade adjustment to the industry’s perception on the topic. These nuances in the...
(Part 1 of 2 posts) I’ve been percolating on this post for a long time. Thanks very much to Mark Burgess for reviewing early drafts of it. One of the ideas that permeates our field of web operations is that we can’t have enough automation. You’ll see experience with “building automation” on almost every job...
I make it no secret that my background is in mechanical engineering. I still miss those days of explicit and dynamic finite element analysis, when I worked for the VNTSC, working on vehicle crashworthiness studies for the NHTSA. What was there not to like? Things like cars and airbags and seatbelts and dummies and that...
(this is part 2 of a series: here is part 1) One of the challenges of building and operating complex systems is that it’s difficult to talk about one facet or component of them without bleeding the conversation into other related concerns. That’s the funky thing about complex systems and systems thinking: components come together...
I’m a firm believer that context is everything, and that it’s needed in every constructive conversation we want to have as engineers. As a nascent (but adorable) engineering field, we discuss (in blogs, books, meetups, conferences, etc.) success and failure in a number of areas, including the ways in which we work. We don’t just...
I thought it might be worth digging in a bit deeper on something that I mentioned in the Advanced Postmortem Fu talk I gave at last year’s Velocity conference. For complex socio-technical systems (web engineering and operations) there is a myth that deserves to be busted, and that is the assumption that for outages and...
In yet another post where I point to a paper written from the perspective of another field of engineering about a topic that I think is inherently mappable to the web engineering world, I’ll at least give a summary. Every time someone on-call gets an alert, they should always be thinking along these lines:...