Why are SLOs important?
SLOs allow you to have a focused definition of what is important for your users and are a decision-making tool that allows your team to decide what to work on next, based on how your system is currently behaving.
They’re considered a first-class citizen in high-functioning product and SRE teams, but like all tools, they can be difficult to approach. Reliably makes that first step easier by exploring your system and automatically creating your first set of objectives.
Getting started with objectives
Starting your reliability journey can be daunting, with numerous things to tend to. Reliably makes those first steps easier by exploring your existing infrastructure or repositories and creating a first set of objectives for you.
Use the Reliably CLI to make Reliably aware of everything you care about. Here is how you tell Reliably to create objectives for your AWS infrastructure.
reliably populate aws
We currently support AWS, GCP, Kubernetes, Dynatrace, Datadog, and GitHub, and are working on adding new providers.
Here is what you get if you use the reliably populate command on a GitHub repository.
Once the Reliably agent starts getting data for your objective, it’s displayed in the objective view, allowing you to see its current status, track its history, and understand how it relates to your system.
SLOs as a discussion-enabling tool
Now that you have your objectives defined and receiving data, what are you going to do with them?
Of course, you’re going to use them to monitor your services’ health and behavior. Paired with alert policies, they tell you when something is going wrong and immediate action is needed.
They’re also a device to start discussing what “good” looks like for your organization. There are “natural” objectives that you want to be monitoring, mostly those based around the latency and availability of major services. But maybe you can come up with new indicators that show your system is performing correctly. For example, major streaming services have noticed that a good metric of their overall system’s health is the number of streams starting per minute. If this number is steady, it means the system is performing as intended, and users are happy. What could be such a metric for your own organization?
Objectives are also a powerful tool to decide what to do next. Are your objectives constantly outperforming their target? Maybe now is the right moment to develop and deploy new features. The potential impact of change can probably be absorbed by your remaining error budget. On the other hand, if you see a downward trend, you can decide to set aside some time to work on making your system more reliable.
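To make the idea of a remaining error budget concrete, here is a minimal sketch in Python. It assumes a request-based objective (a target ratio of good requests over a window); the function name and the sample numbers are hypothetical and only illustrate the arithmetic, not how Reliably computes it.

```python
def error_budget_remaining(target: float, good_events: int, total_events: int) -> float:
    """Return the fraction of the error budget still unspent.

    target is the objective as a ratio, e.g. 0.99 for "99% of requests succeed".
    A negative result means the budget is overspent.
    """
    if total_events == 0:
        return 1.0  # no traffic yet, so nothing has been spent

    allowed_bad = (1.0 - target) * total_events  # failures the objective tolerates
    actual_bad = total_events - good_events      # failures actually observed

    if allowed_bad == 0:
        # A 100% target leaves no budget at all: any failure overspends it.
        return 1.0 if actual_bad == 0 else float("-inf")

    return 1.0 - actual_bad / allowed_bad


# Hypothetical window: 100,000 requests, 400 of them failed, 99% target.
# The objective tolerates about 1,000 failures, so about 60% of the budget is left.
remaining = error_budget_remaining(0.99, good_events=99_600, total_events=100_000)
print(f"Error budget remaining: {remaining:.0%}")
```

A comfortably positive value suggests there is room to ship changes. A value trending toward zero, or below it, is the signal to spend time on reliability instead.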
- What are the benefits of creating SLOs using Reliably?
- You can combine SLOs with scorecards and alerts to get intelligent suggestions on what to do next.
- Does Reliably integrate with my workflow?
- We support a range of data providers such as AWS, GCP, Datadog, and more, and we're working on growing that list. Reliably is also the best place to push the results of your Chaos Toolkit experiments, to see how turbulent conditions might impact your objectives.
- Can I get a free trial?
- Yes, sign up for free (no credit card required) and get access to features such as objectives, scorecards, and alerts. You can cancel any time and you won't be charged anything.