If you want to adopt SRE, you need to understand the 7 SRE principles:
- Embrace Risk
- Service Level Objectives
- Eliminate Toil
- Monitoring
- Automation
- Release Engineering
- Simplicity
These principles are the pillars of SRE. They should be the same in every organization, regardless of its size, department, way of working, or line of business. The principles are immutable, and we need to keep them in mind whenever we apply SRE.
Embrace Risk
Embracing risk means accepting that 100% reliability does not exist. It is impossible to achieve, because we depend on other services and digital products that can fail.
Embracing risk also means accepting that reliability has a cost. It is not free, and we have a limited budget to spend on it.
We need to amaze our customers with innovation, with what they want: new features. To deliver new features, we need to take the RISK.
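To put a number on "there is no 100% reliability", here is a minimal Python sketch of how much downtime a given availability target leaves room for. The 30-day window, the function name, and the example targets are assumptions chosen for illustration, not values from any standard.

```python
# Illustrative only: downtime allowed by an availability target.
# The 30-day window and the targets below are assumed values.

WINDOW_MINUTES = 30 * 24 * 60  # minutes in a 30-day window


def allowed_downtime_minutes(availability_target: float) -> float:
    """Return the minutes of downtime the target allows over the window."""
    if not 0.0 < availability_target <= 1.0:
        raise ValueError("availability_target must be in (0, 1]")
    return WINDOW_MINUTES * (1.0 - availability_target)


if __name__ == "__main__":
    for target in (0.99, 0.999, 0.9999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target:.2%} target -> {minutes:.1f} minutes of downtime allowed")
```

Each extra "nine" cuts the allowed downtime by a factor of ten, which is one way to see why more reliability costs more and leaves less room for taking risk with new features.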
Service Level Objectives
The second principle is to define the SLOs. We measure reliability by defining indicators (SLIs) and comparing them against the objectives (SLOs) and the agreements (SLAs) that we have committed to.
We should amaze our customers with reliable services.
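As a rough sketch of how an indicator relates to an objective, the Python below computes an SLI as the fraction of good events and checks it against an SLO target. The names `SloReport` and `evaluate_slo`, and the idea of counting "good" requests, are assumptions made for this example. Real SLIs can be defined in other ways.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    sli: float         # measured fraction of good events (the indicator)
    slo_target: float  # the objective we want the indicator to meet
    met: bool          # True when the indicator reaches the objective


def evaluate_slo(good_events: int, total_events: int, slo_target: float) -> SloReport:
    """Compare a ratio-style SLI against an SLO target (hypothetical helper)."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= good_events <= total_events:
        raise ValueError("good_events must be between 0 and total_events")
    sli = good_events / total_events
    return SloReport(sli=sli, slo_target=slo_target, met=sli >= slo_target)


if __name__ == "__main__":
    # Assumed numbers: 9,995 successful requests out of 10,000, target 99.9%.
    report = evaluate_slo(good_events=9_995, total_events=10_000, slo_target=0.999)
    print(report)
```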
Eliminate Toil
First of all, we need to define what toil is: toil is any task that does not bring long-term value. We can think of it as repetitive and boring work.
When we say that it does not bring value, we mean that those tasks are not focused on the business. Each task may seem to waste very little time, perhaps just a few minutes, but when we have many of them every day, we waste a lot of time and money.
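A quick back-of-the-envelope calculation shows how those few minutes add up. The Python sketch below is only an illustration. The function name, the 220 working days per year, and the example numbers are all assumptions.

```python
def yearly_toil_hours(minutes_per_task: float, tasks_per_day: int,
                      working_days: int = 220) -> float:
    """Estimate hours per year spent on a small repetitive task.

    working_days=220 is an assumed figure; adjust it to your own calendar.
    """
    if minutes_per_task < 0 or tasks_per_day < 0 or working_days < 0:
        raise ValueError("inputs must be non-negative")
    return minutes_per_task * tasks_per_day * working_days / 60


if __name__ == "__main__":
    # Assumed example: a 5-minute task repeated 12 times a day by one person.
    hours = yearly_toil_hours(minutes_per_task=5, tasks_per_day=12)
    print(f"About {hours:.0f} hours per year for one person")
```

With these assumed inputs, one person spends an hour a day on the task, which is why small tasks are worth measuring before dismissing them.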
Having toil means that we need more people every time the workload increases. Toil is also difficult to identify.
Eliminating toil is also related to how we manage big and complex systems: we cannot manage them if we need to hire a new employee every time we add a new component.
Eliminating toil allows a single team to manage a big and complex system.
To eliminate toil, you need to eliminate everything that is considered waste. In the end, eliminating toil means INVESTING IN PEOPLE.
Monitoring
Monitoring means that we collect, process, and analyze the data that comes from our systems or is sent by them.
When we monitor, we know what is happening in our system. Monitoring means knowing whether the system is healthy or not.
When we monitor, we can learn about:
- The system history: for example, what happened in the system during a given part of the year.
- The system behavior: for example, how it behaves during the peaks and valleys at specific hours of the day.
- Waste: for example, whether resources are over-provisioned and whether we can reduce them.
Monitoring allows us to be proactive instead of just waiting for negative customer feedback.
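The kinds of questions listed above can be sketched with a few lines of Python. This is a toy example, not a monitoring tool: the `summarize_utilization` function, the input format (hour of day mapped to average utilization), and the 30% waste threshold are assumptions for illustration.

```python
from statistics import mean


def summarize_utilization(samples: dict[int, float],
                          waste_threshold: float = 0.30) -> dict:
    """Summarize hourly utilization samples (hypothetical helper).

    samples maps an hour of the day (0-23) to average utilization (0.0-1.0).
    waste_threshold is an assumed cut-off: if even the busiest hour stays
    below it, the resources may be larger than needed.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    peak_hour = max(samples, key=samples.get)
    valley_hour = min(samples, key=samples.get)
    return {
        "average": mean(list(samples.values())),
        "peak_hour": peak_hour,
        "valley_hour": valley_hour,
        "possible_waste": samples[peak_hour] < waste_threshold,
    }


if __name__ == "__main__":
    # Assumed sample data for three hours of one day.
    print(summarize_utilization({3: 0.05, 9: 0.20, 14: 0.25}))
```

In practice the data would come from whatever monitoring system you already use. The point here is only that history, behavior, and waste are all questions you can answer once the data is collected.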
Automation
We call it automation when we let a machine do our job. With automation, no humans are involved in the task, so there is much less room for human error. This reduces risk, and because machines are faster, it also makes us more efficient. Automation saves time and money.
Through automation, we are able to scale: when our business succeeds, we have more customers, and then we have more work to do.
Without automation, we need to scale the number of people, but the idea is to scale the amount of work without scaling the number of people.
Release Engineering
Release engineering is all about having control over what is delivered throughout the whole process.
With release engineering, we know every step involved from conception to production. Thanks to this, we can put quality gates between the steps.
This principle means that we can deliver safely, without impacting reliability.
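One way to picture quality gates between steps is a pipeline that stops at the first failed check. The Python sketch below is a simplified illustration under that assumption. The `run_release` function and the gate names are hypothetical and are not tied to any particular CI/CD tool.

```python
from typing import Callable, List, Tuple

# A gate is a name plus a check that returns True when the gate passes.
Gate = Tuple[str, Callable[[], bool]]


def run_release(gates: List[Gate]) -> Tuple[bool, List[str]]:
    """Run gates in order and stop at the first one that fails.

    Returns (released, names_of_gates_passed).
    """
    passed: List[str] = []
    for name, check in gates:
        if not check():
            # A failed gate blocks everything after it.
            return False, passed
        passed.append(name)
    return True, passed


if __name__ == "__main__":
    # Hypothetical gates; real checks would run tests, scans, approvals, etc.
    pipeline: List[Gate] = [
        ("unit tests", lambda: True),
        ("staging smoke test", lambda: False),
        ("production rollout", lambda: True),
    ]
    released, passed = run_release(pipeline)
    print("released:", released, "| gates passed:", passed)
```

Because the rollout step only runs after every earlier gate has passed, a problem found in staging never reaches production in this sketch.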
Simplicity
Simplicity means that we focus on what is essential, on what matters to the customers. Simplicity also means that we need to think about the final user experience, for both internal and external users.
To be simple, we need to reduce the noise (all the information and data that overwhelm us).
Simplicity also means that we need to reduce costs for everyone: reduce usage and resources to the bare minimum. Remember that simplicity also means sharing knowledge, because simple solutions make systems easy to handle.
To achieve simplicity, we need the discipline to stay simple. And simplicity is what we need for reliability.
Now it is time to review the 7 SRE Practices.

