The Spampede Filter
This post can also be found on the Grab Engineering blog
In Southeast Asia, when it rains, it pours. It’s a major mood dampener, especially if you are stuck...
Post-Incident Questionnaire for Engineers
This is my light-hearted attempt to help engineers get the most value out of a downtime incident.
Getting Started with SRE – Step 2 – Dashboards
Introduction
In Part 1 of this series, we introduced the goal of understanding how our system performs by adding instrumentation. This article expands on this goal by taking...
Post-Incident Questionnaire for Managers
This is my light-hearted attempt to help engineering managers get the most value out of a downtime incident.
Introduction
So you had an incident? Condolences. On...
Designing resilient systems: Circuit Breakers or Retries? (Part 2)
This post can also be found on the Grab Engineering blog
This post is the second part of the series on Designing Resilient Systems. In Part 1, we looked...