Features

Reliably is your ‘Reliability as Code’ platform

Made for

Reliably helps Site Reliability Engineers and Development Teams shoulder the burden of building predictable, reliable systems.

Get Started

Conversation Starter

SLOs and error budgets are a great way to start discussing what everyone should be doing next.

Decision Maker

Is it the right moment to experiment on new features, or to scale up your infrastructure? With the right data, it's easier to make the right move.

Business Enabler

Happy users are vital to your business. By defining what is meaningful to them, you're able to prioritize where you invest.

For Developers

Collaborate on Reliability
  • Developer-centric

    Reliably is always where you are, never in your way. Our CLI integrates nicely with your favorite CI/CD pipeline, to provide you with useful information when you need it.

  • Everything is Code

    Because SLOs are declared as YAML manifests, you can edit them, version them, and share them the same way you work with the rest of your codebase.

  • Error-free deployments

    You take pride in shipping beautiful and robust code. But how do you run it? The reliably scan command makes sure your Kubernetes manifests and clusters follow best practices, and provides you with solutions when they don't.

  • Ship at the right moment

    Reports on SLO status in your continuous delivery pipeline mean you can use them as guardrails to decide when to deploy that new feature, and when it's better to work on making your application more robust.
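The guardrail idea can be sketched in a few lines of Python. This is a minimal illustration, not part of the Reliably CLI: the `SloStatus` type, its fields, and the `should_deploy` and `gate_exit_code` functions are hypothetical names, and the sample values are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class SloStatus:
    """Hypothetical record for one SLO; not the Reliably CLI's data model."""
    name: str
    current: float    # measured percentage over the time window
    objective: float  # target percentage


def should_deploy(slos: list[SloStatus]) -> bool:
    """Return True only when every SLO currently meets its objective."""
    return all(slo.current >= slo.objective for slo in slos)


def gate_exit_code(slos: list[SloStatus]) -> int:
    """Exit code a CI step could return: 0 lets the pipeline continue, 1 blocks it."""
    return 0 if should_deploy(slos) else 1


# Illustrative values: availability is healthy, latency is not.
slos = [
    SloStatus("availability", current=100.0, objective=99.0),
    SloStatus("latency", current=73.91, objective=99.0),
]
print(gate_exit_code(slos))  # prints 1
```

A pipeline step that exits with this code would hold back the deployment until the latency SLO recovers, which leaves time to work on robustness instead.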

The Reliably CLI

The Reliably CLI helps developers and everyone who cares about reliability introduce best practices into their current workflow, in a non-disruptive way.

Get Started

reliably slo report
                                       Current  Objective  / Time Window     Type             Trend    
  Service #1: http-api                                   
   99% availability over 1 hour      100.00%        99%  / 1h0m0s          Availability     ✓ ✓ ✓ ✓   
   99% of requests under 300ms        73.91%        99%  / 1d              Latency          ✕ ✕ ✕ ✕   
                                                         
  Service #2: products-api                               
   99% availability over 1 day       100.00%        99%  / 1d              Availability     ✓ ✓ ✓ ✓
reliably scan kubernetes --format table
Results:
  manifests/deployment.yaml  Kubernetes:Deployment         K8S-DPL-0007  Setting a high cpu request may render pod scheduling difficult or starve other pods 
  manifests/deployment.yaml  Kubernetes:Deployment         K8S-DPL-0009  Not setting a cpu requests means the pod will be allowed to consume the entire available CPU (unless the cluster has set a global limit)
  manifests/deployment.yaml  Kubernetes:Deployment         K8S-DPL-0013  A rollout strategy can reduce the risk of downtime 
  manifests/deployment.yaml  Kubernetes:Deployment         K8S-DPL-0014  Without the 'minReadySeconds' property set, pods are considered available from the first time the readiness probe is valid. Settings this value indicates how long it the pod should be ready for before being considered available.
  manifests/deployment.yaml  Kubernetes:Deployment         K8S-DPL-0001  You should specify a number of replicas 
  manifests/pod.yaml         Kubernetes:Pod                K8S-POD-0001  You should not use the default 'latest' image tag. It causes ambiguity and leads to the cluster not pulling the new image
  manifests/pod.yaml         Kubernetes:Pod                K8S-POD-0003  Only images from an approved registry can be run 
  manifests/deployment.yaml  Kubernetes:Deployment         K8S-DPL-0012  Image pull policy should usually not be set to 'Always' 
  test-manifest.yaml:92:1    Kubernetes:PodSecurityPolicy  K8S-PSP-0001  Enabling privileged can lead to unwanted escalation from the container's process 
  test-manifest.yaml:92:1    Kubernetes:PodSecurityPolicy  K8S-PSP-0007  To reduce risk of accessing files outside of an allowed paths, it's best to make them read only 
Summary:
  10 suggestions found
   3 info -  5 warning -  2 error

Reliably SLO Report

Service Level Objectives identify what you should care about on your system. They describe what good looks like for the users of your system. If an SLO is underperforming, your users are being affected in some way.

The Reliably CLI allows you to define SLOs for Availability and Latency.

An Availability SLO allows you to specify a target availability percentage for a Service.

A Latency SLO allows you to specify a threshold latency for a Service and a target percentage. The target percentage is the share of responses that must complete within that threshold latency.
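As a rough sketch of these two definitions, the Python below computes the measured percentage for each SLO type from a list of requests. It assumes a hypothetical `Request` record with a success flag and a latency in milliseconds, and all function names are made up for the example. How the Reliably CLI itself aggregates metrics is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request sample used only for this illustration."""
    ok: bool           # True if the request was served successfully
    latency_ms: float  # response time in milliseconds


def availability_pct(requests: list[Request]) -> float:
    """Percentage of requests that succeeded (expects a non-empty list)."""
    return 100.0 * sum(r.ok for r in requests) / len(requests)


def latency_pct(requests: list[Request], threshold_ms: float) -> float:
    """Percentage of requests answered within the threshold (non-empty list)."""
    fast = sum(r.latency_ms <= threshold_ms for r in requests)
    return 100.0 * fast / len(requests)


def meets(current_pct: float, target_pct: float) -> bool:
    """An SLO is met when the measured percentage reaches the target."""
    return current_pct >= target_pct


# Made-up sample: four requests, one failure, one slow response.
sample = [
    Request(ok=True, latency_ms=120),
    Request(ok=True, latency_ms=280),
    Request(ok=True, latency_ms=450),
    Request(ok=False, latency_ms=90),
]
print(availability_pct(sample))             # 75.0
print(latency_pct(sample, 300))             # 75.0
print(meets(availability_pct(sample), 99))  # False
```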

For more details on SLO reports, see the Reliably documentation on How the Reliably CLI works.

Report time: Wed, 19 May 2021 10:46:07 UTC

Service #1: Service GTF

Name                                               Current  Objective  Time Window  Type          Trend
95% of requests successful over last 1 hour                 95%        1 hour       availability  ✓ ✓
98% of requests faster than 350ms over last 1 day           98%        1 day        latency       ✕ ✕

Error Budget

When you define an SLO for your system, you include a target percentage for that SLO. An example target could be 95%. That leaves 5%, which is the error budget for your SLO.

When you define an SLO with the Reliably CLI, you specify a target percentage for the Availability or Latency of that SLO.

The expectation is that, over a time window, the responses for the Service will be within that target percentage.

An example could be 99.5% available over a period of 7 days. The target availability is less than 100% so this leaves a margin for error. This can be considered the Error Budget.
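The arithmetic behind the 99.5% over 7 days example can be sketched as follows. This uses the common time-based reading of an error budget (the share of the window that may miss the objective). It is an assumption made for illustration and not necessarily how the Reliably CLI derives the figures in its report. The function name is hypothetical.

```python
from datetime import timedelta


def error_budget(target_pct: float, window: timedelta) -> timedelta:
    """Time within the window that may miss the objective."""
    budget_fraction = (100.0 - target_pct) / 100.0
    return window * budget_fraction


budget = error_budget(99.5, timedelta(days=7))
print(budget)  # 0:50:24
```

Under this reading, a 99.5% target over 7 days leaves a little over 50 minutes of margin for the whole week.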

The Error Budget metrics are:

Type          Name                                               ErrorBudget(%)  Downtime  Consumed               Remain
availability  95% of requests successful over last 1 hour        5.00%           18s       0s                     18s
latency       98% of requests faster than 350ms over last 1 day  2.00%           2m52s     2h11m35s (+2h8m43s)    0s
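The Consumed and Remain columns are related by simple subtraction: in the latency row, the figure in parentheses is the amount by which consumption exceeds the allowed downtime, and the remainder bottoms out at zero. The sketch below reproduces that arithmetic. The function name, and the reading of the Downtime column as the allowed budget, are assumptions made for illustration.

```python
from datetime import timedelta


def budget_status(allowed: timedelta, consumed: timedelta) -> tuple[timedelta, timedelta]:
    """Return (remaining, overage); at most one of the two is non-zero."""
    zero = timedelta(0)
    remaining = max(allowed - consumed, zero)
    overage = max(consumed - allowed, zero)
    return remaining, overage


# Latency row: 2m52s allowed, 2h11m35s consumed.
remaining, overage = budget_status(
    allowed=timedelta(minutes=2, seconds=52),
    consumed=timedelta(hours=2, minutes=11, seconds=35),
)
print(remaining)  # 0:00:00
print(overage)    # 2:08:43
```

Applied to the availability row (18s allowed, 0s consumed), the same function returns the full 18s as remaining and no overage.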

Generated by the Reliably CLI Version 0.16.0 (2021-05-19)

  • SLO monitoring

    Declare SLOs as code and get actionable reports right where you need them: in the CLI, in your continuous delivery pipeline, or as a PDF to share with your teammates or manager.

    Read the docs

  • Kubernetes Scan

    Are you sure all your Kubernetes manifests or clusters follow best practices? The Reliably CLI scans them for potential vulnerabilities and comes up with fixes you can implement right away.

    Read the docs

  • CI/CD integration

    Make reliability a part of your development process by automating SLO reports or Kubernetes scans with your favorite CI/CD tool. You can even use those reports as gates to block a pipeline under certain conditions.

    Read the docs

  • Open Source

    Reliably loves and supports open source. The CLI is developed as an open source project under the Apache-2.0 license, so you can understand exactly what happens under the hood, or even contribute.

    GitHub repository

Integrations

Connect your favorite tools

Easily and efficiently fetch raw data from your cloud services provider or monitoring solution.
Seamlessly connect to your continuous delivery pipeline.

Fetch Metrics

  • AWS

    Fetch metrics from your AWS services

    Get Started
  • Google Cloud Platform

    Google Load Balancer metrics provider

    Get Started
  • Datadog

    Import Datadog SLO metrics into Reliably

    Coming Soon

CI/CD Integrations

For SREs

Enable your team
  • SLOs for everyone

    Make SLOs accessible to everyone by displaying reports in GitHub PRs or in continuous delivery pipelines. Never lose time telling people to look at dashboards again!

  • Chaos Engineering

    Validate your reliability efforts with powerful and flexible chaos engineering experiments. Made by the same people as Reliably, the popular open source Chaos Toolkit has proven to be a solution of choice for thousands of reliability experts. And with the upcoming Reliably Chaos Toolkit extension, it's even easier to run experiments only when your SLOs are healthy.

For managers

Business logic meets reliability
  • Speak a common language

    You might sometimes have the feeling the development teams follow a different agenda than yours. You care about ROI and user happiness, while they work on features, issues, and availability. SLOs are a way to quantify what is important to you, in a language that is shareable across the whole organization.

  • Users-first SLOs

    You know what your users care about, but you're not sure how it relates to measurable metrics? With Reliably SLOs you can declare what is important to you, and find out later how to measure it. Or make it your reliability team's mission.

Start your journey now

Get Started