SRE Blog

Better living through reliability.


2020-08-15 MemoryUsageTooHigh Alert
2020-08-13 JVMHeapUsageTooHigh Alert
2020-08-11 GoroutinesTooHigh Alert
2020-08-07 Postmortem Tip of the Day: Summaries
2020-08-05 Postmortem Tip of the Day: Root Causes
2020-08-04 FileDescriptorsTooHigh Alert
2020-08-03 Proactive vs Reactive Alerts
2020-08-02 Tips to Prevent Spammy Alerts
2020-08-01 DiskUsageTooHigh Alert
2020-07-30 CPUUsageTooHigh Alert
2020-07-29 Base Alerts
2020-07-27 Pre-Postmortem Meetings


I'm an SRE and so can you! This blog translates SRE principles into practical, concrete advice. SRE books outline the high-level principles of SRE, but folks often still struggle when implementing alerts, production reviews, SLOs, error budgets, etc. I'll help you supercharge your SRE skills, just as if you were a member of my SRE team!


If you enjoy the content or have a question you'd like me to answer on the blog, email sreblog@.