Although service-level objectives (SLOs) continue to grow in importance, there’s a distinct lack of information about how to implement them. Practical advice that does exist usually assumes that your team already has the infrastructure, tooling, and culture in place. In this book, recognized SLO expert Alex Hidalgo explains how to build an SLO culture from the ground up. Ideal as a primer and daily reference for anyone creating both the culture and tooling necessary for SLO-based approaches to reliability, this guide provides detailed analysis of advanced SLO and service-level indicator (SLI) techniques. Armed with mathematical models and statistical knowledge to help you get the most out of an SLO-based approach, you’ll learn how to build systems capable of measuring meaningful SLIs with buy-in across all departments of your organization.
Quality material on advocacy for SLOs. Lots of good definitions and extensive advice on all SLO-related aspects (advocate, adopt, implement, evolve). I also found it a bit repetitive, and skipped a couple of chapters.
I went into this book as part of a training week offered at my company and I am SO glad I took the time to read it. I have so many highlights to refer back to, and this book was nothing less than transformational. I used to think of SLOs and SLIs as a meeting I need to go to every couple of weeks as part of my job description and just figure out. Now I understand them so much more, as a tool that will not only help my team understand the product statistics but also ensure that they know perfection is not something to be attained; rather, a realistic goal should be set so that they can feel fulfilled and the customer satisfied with our delivery.
Very thorough coverage of the topic, with examples taken from domains that most will have had exposure to, if not experience in. The exploration of how you go about introducing SLOs to a business and its culture is really thought-provoking, and I found the coverage of SLOs for data processing particularly interesting as I've not seen it covered elsewhere. The only gripe, and a very minor one at that, is that there wasn't any particular advice for linking SLOs to SLAs even though that is the logical next step in adoption.
Great read for all kinds of engineers to better understand everything related to SLO development and implementation, but also for managers looking to establish an SLO culture within their organization. I quite like the author's introduction to reliability, including its contributing factors, and the various examples throughout the book, e.g. how to architect a system for reliability.
Nice intro to the topic. The first part (chapters 1 to 5) is the most important part of the book, where the core concepts are introduced (reliability stack, SLIs, SLOs, error budget). The two other parts feel a lot like the same things are being repeated over and over using different ideas and concepts. The exception is Chap 10, where a "real world" example is presented and built upon.
The first three chapters and the chapter on how statistics help in understanding your system are wonderful; the rest feel like a cross between filler and trying to teach communicating and selling ideas to non-engineers. While not a bad thing to have in a book, it felt out of place. Also, the advice was oriented to very junior engineers.
A must-read for anyone who wants to take an SLO-based approach to reliability.
Well written and lots of practical examples. Everything from what are SLIs, SLOs, SLAs and Error Budgets, to how to get people on board, how to advocate for this type of practice and so on.
Good and actionable advice on how to create SLOs. I liked how it helps frame the creation of SLOs by starting with one or a few and not worrying about creating too many.
In summary: this is how you should start an SLO culture.
I generally don't read all the chapters of IT books; rather, I prefer selected topics as needed. But there is something in this book.
I read all 350ish pages. Except for two chapters, 9 and 10, all of them are relevant to the central theme and explained well. The probability and architecture-for-reliability chapters need more work to make them more aligned and more useful.
Nice book to read. This might be the top one on this theme compared to the other books I've read on SRE topics.
The "get it done" version of the SRE book when it comes to SLOs, which are a foundational component of the discipline. I keep it on my desk and find myself referring to it repeatedly and often.