Intelligent Observability for SREs

Recorded | Wed, March 25, 2021

Toil and trouble go together like peanut butter and jelly, but are not nearly as sweet. For one SRE, Dinesh, the relentless effort to eliminate toil can feel like a game of whack-a-mole. Thankfully, automation is the cure for unreliability. However, the number of tools available and in use can be overwhelming, so SREs like him need to pick their fire-fighting equipment carefully.

In this webinar, DevOps Institute Chief Ambassador Helen Beal and Moogsoft Director of SRE Thom Duran will explore why intelligent monitoring is an SRE’s best friend. Join us as we discuss the day-to-day impact of leveraging AI-driven observability for:

  • Monitoring SLIs and SLOs
  • The Rule of 3 and automation effort
  • Prioritizing reliability over functionality
  • Managing tool heterogeneity and proliferation
  • Chaos engineering for antifragility

You’ll learn:

  • How to measure system reliability over time
  • Why toil causes so much trouble
  • Where to prioritize effort
  • AIOps tips and techniques for SREs


Helen Beal
Chief Ambassador, DevOps Institute

Thom Duran
Director of SRE, Moogsoft


Coffee Break with Helen Beal: Intelligent Observability for SREs