DevOps and Site Reliability Engineering (SRE) are often lumped together because many people assume they do similar work, despite significant differences between the roles. Although both aim to reduce the problems that arise in software development, their goals, skill sets, and approaches are quite different.
DevOps engineers focus on the development pipeline, and their goal is to enable better development processes and workflows. With their help, your engineering team will be able to build more quickly and with less hassle.
SREs focus on the reliability and stability of your product once it’s out in the world. With their help, your team will design more stable systems, and your product will recover from incidents more quickly.
In a nutshell, DevOps is focused on improving your pre-production processes so it’s easier to develop and release your product. SREs focus on making the product easier to run in production.
What do DevOps engineers do?
As the name suggests, DevOps engineers improve the operations of your development team. They try to make building and deploying code easier. This includes setting up things like CI/CD pipelines, local environments, or build tooling. For instance, your DevOps team might set up a process where every PR automatically spins up a staging environment for testing.
They introduce best practices for software engineers and provide support for them, e.g. setting up a process where, if your SWEs use some preferred set of frameworks and pipelines, they get testing and monitoring set up automatically. They also enforce development standards.
What do SREs do?
SREs try to make the product more stable and reliable. Although they also work with engineers in pre-production, the SRE’s focus is how the product will work in production.
In pre-production, they advise developers on the technical design and make recommendations on how to improve stability, increase uptime, meet SLOs and SLAs, etc. If any part of the system breaks, SREs will lead remediation, so they are incentivized to prevent issues from coming up in the first place.
While services are running in production, they keep an eye on everything from the services themselves to the underlying infrastructure to make sure it is all running smoothly. This includes monitoring and keeping track of SLAs, as well as setting up reporting for important metrics like uptime and latency.
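To make the idea of tracking uptime and latency against a target more concrete, here is a minimal Python sketch. It is only an illustration: the names `SloTargets`, `uptime_pct`, `percentile`, and `report` are hypothetical, the thresholds are made-up example values, and a real SRE team would typically pull these numbers from its monitoring system rather than compute them by hand.

```python
from dataclasses import dataclass


@dataclass
class SloTargets:
    """Hypothetical reliability targets for one service (example values only)."""
    min_uptime_pct: float       # e.g. 99.9 means "three nines"
    max_p95_latency_ms: float   # e.g. 300.0


def uptime_pct(total_minutes: int, downtime_minutes: int) -> float:
    """Share of the reporting window in which the service was up, as a percentage."""
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def percentile(values, pct: int):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    # Ceiling of pct * n / 100, done with integers to avoid float rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def report(total_minutes, downtime_minutes, latencies_ms, targets: SloTargets) -> dict:
    """Summarize uptime and p95 latency and compare each with its target."""
    up = uptime_pct(total_minutes, downtime_minutes)
    p95 = percentile(latencies_ms, 95)
    return {
        "uptime_pct": round(up, 3),
        "p95_latency_ms": p95,
        "uptime_ok": up >= targets.min_uptime_pct,
        "latency_ok": p95 <= targets.max_p95_latency_ms,
    }


if __name__ == "__main__":
    # A 30-day window (43,200 minutes) with 50 minutes of downtime.
    targets = SloTargets(min_uptime_pct=99.9, max_p95_latency_ms=300.0)
    sample_latencies = [10 * i for i in range(1, 21)]  # made-up samples in ms
    print(report(43_200, 50, sample_latencies, targets))
```

In this made-up example, 50 minutes of downtime in a 30-day window leaves uptime just under a 99.9% target, so the report flags uptime as missed even though latency is within bounds. That kind of signal is what the reporting described above is meant to surface.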
If anything does go wrong from a reliability standpoint, they’re the first line of defense. They maintain runbooks that define processes for handling incidents and will usually be the first to notice and respond when an incident occurs.
How does the day-to-day work differ for SRE vs. DevOps?
The day-to-day work of DevOps engineers and SREs is markedly different.
DevOps engineers will have more hands-on work, like scripting and building out pipelines. The role is more like software engineering than the SRE role is.
In an operational role, DevOps engineers also need to be aware of the processes that engineers are following, understand the goals and bottlenecks in them, and build out tools to optimize workflows.
SREs need an engineering background, but their role is less about direct engineering work and more about the application of engineering concepts to support the product. They consider higher-level infrastructure decisions and practice systems thinking to guide developers on goals like improving resiliency or setting up scalable load balancing.
SREs also deal with operations in setting up incident responses. Beyond the base level of alerting and monitoring, they need to know how to investigate and escalate to the right people in case of an incident.
Finally, SREs enforce production readiness and reliability standards. They make sure that products are ready to be launched and handed off to the SRE team, which then acts as the first line of defense.
Where do they overlap?
Before and after a product release, the line between DevOps and SRE is fairly clear—but what about for the release itself? This is where most people’s understanding of the roles starts to blur. However, DevOps engineers and SREs have clear responsibilities even during releases.
DevOps usually focuses on the actual release and the steps needed to push the product out. SREs, on the other hand, ensure the product is operationally ready for release: does it have monitoring and alerting set up, are there defined reliability standards, and is there an on-call rotation? Then, once it’s released, SREs will monitor to confirm the product meets reliability standards.
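The readiness questions above lend themselves to a simple automated check. The following Python sketch shows one way such a check could look. The `Service` fields, the `CHECKS` table, and the function names are all hypothetical and chosen for illustration. They do not represent any particular tool's API.

```python
from dataclasses import dataclass


@dataclass
class Service:
    """Hypothetical description of a service's operational setup."""
    name: str
    has_monitoring: bool
    has_alerting: bool
    has_reliability_standards: bool
    has_on_call_rotation: bool


# Each readiness requirement maps to a function that inspects the service.
# The list mirrors the questions in the text and is an assumption, not a standard.
CHECKS = {
    "monitoring": lambda s: s.has_monitoring,
    "alerting": lambda s: s.has_alerting,
    "reliability standards": lambda s: s.has_reliability_standards,
    "on-call rotation": lambda s: s.has_on_call_rotation,
}


def missing_requirements(service: Service) -> list[str]:
    """Return the names of the readiness requirements the service does not meet."""
    return [name for name, check in CHECKS.items() if not check(service)]


def ready_for_release(service: Service) -> bool:
    """A service is ready only when no requirement is missing."""
    return not missing_requirements(service)


if __name__ == "__main__":
    checkout = Service(
        name="checkout",
        has_monitoring=True,
        has_alerting=True,
        has_reliability_standards=False,
        has_on_call_rotation=False,
    )
    print(ready_for_release(checkout))
    print(missing_requirements(checkout))
```

A check like this could run as a step in a release pipeline, which is one way the split of responsibilities might play out: SREs decide what goes into the list of requirements, and DevOps wires the check into the release process.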
Though their focus is different, the two teams should have a tight partnership. Most organizations have separate DevOps and SRE teams, but will roll them up into a larger org like Eng Productivity to keep their outcomes aligned.
An important collaboration is how SREs and DevOps engineers work together to make reliability as easy and automatic as possible. SREs will define what readiness means in production. DevOps makes it easy to meet those standards by setting up best practices and pipelines for engineers. The two teams need to align on what DevOps can reasonably accomplish, so that DevOps can translate the standards into tangible steps for their pipelines. If SREs are setting reliability standards that DevOps can’t build tools to meet, it’s unlikely that SWEs will be able to meet those standards with their own manual effort.
How does someone get into DevOps vs. SRE as a career?
Both SREs and DevOps engineers often come from sysadmin backgrounds. However, a good candidate for a DevOps role usually has experience with the low-level tasks involved in system administration, like setting up Ansible workflows and spinning up VMs.
SREs usually come from a sysadmin background where they were setting up and running high-level systems, e.g., managing networks, load balancing, or RPCs. They can also come from SWE backgrounds, as the role can be very technical. Prior experience in software engineering can be valuable to understanding the full picture of the product.
DevOps is usually the easier of the two to transition into. SRE requires more specialized knowledge and experience.
Should I hire a DevOps engineer or SRE first?
If you’re a growing startup, it’s more likely you’ll need a DevOps engineer first.
At the early points in your growth, your software engineers should still be thinking about the SRE concerns themselves. Setting up alerting and monitoring isn’t too much extra work and building stable systems is closely related to what your SWEs should be doing already.
On the DevOps side, though, there are likely lots of things that can be done to make your team more productive. Developers may find themselves spending time managing CI/CD issues, or they may wish they had a solid staging environment for testing. Building these improvements is less directly connected to the work your SWEs should be doing. Adding a DevOps engineer to handle these process issues will make your team faster and keep your developers focused on the product.
The need for an SRE comes once your Eng team has grown to the point that they can’t manage reliability themselves. If you have multiple teams all defining and tracking reliability differently and you need to enforce more rigor, that’s a good time to consider adding an SRE.
Cortex supports your DevOps engineers and SREs
Both your DevOps and SRE teams will work with and benefit from using Cortex. To help engineers keep track of all your services, your DevOps team can set up your Service Catalog and route data into it.
With Scorecards, SREs can define standards for production readiness and DevOps engineers can use them to set up best practices. Scorecards will also make it easy to see which teams have adopted those best practices.
Set up a demo today and see how Cortex can support your DevOps and SRE teams.