A developer environment — known more commonly as a staging environment — is a replica of your production environment that’s designed specifically for testing. This setup has some obvious benefits: engineers can experiment with new ideas in a “safe” place, and no one needs to worry about breaking software in production.
While staging environments have some clear advantages, they’re not ideal for every organization or situation. In this article, we’ll examine the value that staging environments offer, as well as the obstacles they may present, before taking a look at methods for most effectively provisioning, maintaining, and managing the environments.
The value of staging environments
Staging environments offer significant value to engineers, especially as time goes on. As your software projects expand along with your customer base, it becomes more important to test changes before they land in production. You don’t want to release code that has only just passed its automated checks to all of your users without being sure that it works as intended. Ideally, you’ll have a method of testing changes in an environment that’s as close to production as possible.
While automated testing is important, it can sometimes be too pristine. A staging environment offers engineers the opportunity to test changes manually, which can open their eyes to unexpected outcomes an automated test may not detect.
Staging environments also give you the opportunity to test at production scale. Having a separate, dedicated environment that closely mimics production can help engineers uncover and address data drift. In the process, engineers can avoid the too-common problem of releasing something that works on a local setup but fails for customers because of unexpected data.
As alluded to earlier, developer environments can serve as a “playground” for engineers to experiment with. A staging environment provides engineers with an outlet for their creativity and the opportunity to test out bold ideas without fear of breaking things.
Staging environments can serve as a playground for non-engineers, too. One of the best ways to learn something is by doing it, and by having these playgrounds available to everyone at your organization, people from customer success, sales, support, and other teams are empowered to experiment and learn. Ultimately, this fosters a greater understanding of your software across the board.
Why staging environments fail
Given the number of challenges a staging environment can help you overcome, it’s easy to dive head-first into adoption. However, developer environments aren’t necessarily easy to work with, and aren’t ideal for every team. In fact, there have been a number of companies that have recently chosen to reject staging environments completely. There are a few common pitfalls that lead teams to failure with developer environments.
Staging doesn’t mimic production
If the staging environment doesn’t mirror production, its behavior will differ from what users see in reality, sometimes dramatically so, and team members will quickly lose faith in it. This usually happens because the code running in staging hasn’t been maintained and has become outdated.
Lack of automation
When staging environments lack automation, the pressure falls on engineers to update code manually, which creates operational bottlenecks. Without a pipeline that automatically deploys your code to staging, maintaining the staging environment becomes an uphill battle. Over time, this can snowball, leaving the developer environment with code that is too outdated to be usable.
Poor branching strategy
Staging should always run the same code as production, if not code that is ahead of production. Automation isn’t enough here — if your branching strategy is complicated, you’ll run into issues with code that’s out of sync with your source of truth. As with these other challenges, a poor branching strategy leads to degraded faith in the staging environment.
One-size-fits-all approach
While staging environments can address a number of engineering problems, a single staging environment cannot solve all of them at once. Your team may keep a pristine pre-production environment that mimics production precisely, while your playgrounds exist separately (with a mutual understanding that the developers who use those environments must take responsibility for fixing them, as well).
Wrong priorities
Given these challenges, staging environments are not ideal for every team or product. For example, during the early stages of a company, teams often optimize for speed, working to deliver output faster than ever in an effort to draw in more customers. During this stage, teams will be on the small side, usually fewer than 10 engineers. At a time like this, multiple environments are just more likely to slow teams down, even with the right automation in place. Ultimately, the time it would take to maintain those staging environments would be better spent working on the product.
Best practices
We can learn from these common challenges and embrace the benefits of staging environments to develop some best practices.
Branching strategy
With the advent of CI/CD tools and automation, teams no longer need a complex branching strategy. This is especially true with checks and balances in place, such as a strong testing suite and/or a solid code review process.
Continuous integration is particularly key. A simple branching strategy has one significant pitfall: without enough tests or coverage, you can run into too many regressions from code being merged from all different directions. By investing early on in CI, you can avoid trouble with branching down the road.
Ideally, production follows staging in your pipeline. As we’ve covered, once staging drifts away from production, people stop using the environment. Deploy the latest code to staging first, then deploy the same code to production. If you’re using a tool like Docker, you can deploy the same generated image to both environments and use feature flags to control which features are active in each.
The simplest branching strategy is no strategy at all. Set up one main branch that first goes to staging and then to production.
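The "staging first, then production" flow can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real deployment tool: the Pipeline class, its deploy and smoke_test callables, and the digest string are all hypothetical stand-ins for whatever your CI/CD system provides.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Pipeline:
    """Promotes one immutable build through staging, then production."""

    # Hypothetical hooks: deploy(environment, image_digest) and smoke_test(environment).
    deploy: Callable[[str, str], None]
    smoke_test: Callable[[str], bool]
    history: List[Tuple[str, str]] = field(default_factory=list)

    def release(self, image_digest: str) -> bool:
        # Staging always receives the build first.
        self.deploy("staging", image_digest)
        self.history.append(("staging", image_digest))

        # Stop here if staging is unhealthy; production is never touched.
        if not self.smoke_test("staging"):
            return False

        # Promote the exact same artifact rather than rebuilding it.
        self.deploy("production", image_digest)
        self.history.append(("production", image_digest))
        return True
```

Because the same digest is passed to both deploy calls, production can never run code that staging has not already run, which is the property the single-branch approach is meant to guarantee.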
Automation
Another major hurdle in adopting staging environments is the lack of automation. This can sometimes be ignored or deprioritized because a developer environment is not the production environment, but that mindset is exactly what perpetuates the issue. Without automated deployments in place, staging becomes outdated. With automation in place, however, you can minimize time to delivery and reduce failure rates.
Feature flags
When you have code that isn’t ready to be deployed to production, but that should be rolled out to internal teams, you can use feature flags. Feature flags enable you to label features based on different parameters, allowing you to enable a feature on your developer environments but not on production.
Feature flags mesh well with continuous delivery practices. You don’t necessarily want to hold back massive pieces of code on a long-lived feature branch and release them all at once; the potential for something to break is too great.
More importantly, though, feature flags allow you to develop and showcase features to internal teams without releasing to users.
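As a rough sketch of the idea, a flag can simply list the environments in which it is switched on. The flag names, the FLAGS table, and the APP_ENV variable below are illustrative assumptions; a real project would more likely use a dedicated feature-flag service or library.

```python
import os

# Hypothetical flag table: each flag lists the environments where it is enabled.
FLAGS = {
    "new_checkout": {"development", "staging"},                # internal only
    "dark_mode": {"development", "staging", "production"},     # fully released
}


def is_enabled(flag: str, environment: str) -> bool:
    """Return True if the flag is on in the given environment.

    Unknown flags are treated as off, so a typo never exposes a feature.
    """
    return environment in FLAGS.get(flag, set())


def current_environment() -> str:
    # APP_ENV is an assumed variable name; default to the most restrictive
    # environment so that a missing setting never enables internal features.
    return os.environ.get("APP_ENV", "production")
```

With this table, is_enabled("new_checkout", "staging") is true while is_enabled("new_checkout", "production") is false, so the same deployed code behaves differently in each environment.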
Be conscientious when working with feature flags — you don’t want your codebase to be littered with hundreds of flags or more. Track usage of your flags and swiftly delete the ones that are no longer in use and those assigned to features that have been fully rolled out.
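One way to keep the flag count under control is a periodic report of removal candidates. The sketch below assumes you can export, for each flag, its production rollout percentage and the date it was last evaluated; the FlagRecord fields and the 30-day threshold are hypothetical choices, not a standard.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class FlagRecord:
    name: str
    rollout_percent: int             # share of production users who see the feature
    last_evaluated: Optional[date]   # None if the flag was never checked at runtime


def flags_to_remove(records: List[FlagRecord], today: date,
                    max_idle_days: int = 30) -> List[str]:
    """List flags that are fully rolled out or have not been used recently."""
    cutoff = today - timedelta(days=max_idle_days)
    candidates = []
    for record in records:
        fully_rolled_out = record.rollout_percent >= 100
        unused = record.last_evaluated is None or record.last_evaluated < cutoff
        if fully_rolled_out or unused:
            candidates.append(record.name)
    return candidates
```

Running a report like this on a schedule turns flag cleanup into a routine task instead of an occasional archaeology project.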
Maintenance
Just like any other engineering project or product, staging environments need love, maintenance, and downtime. Things can become outdated, branches can get stale, and the differences between staging and production can grow more significant.
Make sure managers and higher-ups understand the importance of time set aside for repair and maintenance. Conduct regular audits to see what has changed in production and whether it was applied to staging.
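Part of such an audit can be automated. The following sketch compares two snapshots, for example service versions or configuration values, and reports every key that differs. How the snapshots are collected is left out, and the find_drift function and MISSING marker are illustrative names.

```python
from typing import Any, Dict, Tuple

MISSING = "<missing>"  # marker for a key that exists in only one environment


def find_drift(production: Dict[str, Any],
               staging: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (production_value, staging_value)} for every mismatch."""
    drift = {}
    for key in sorted(set(production) | set(staging)):
        prod_value = production.get(key, MISSING)
        stage_value = staging.get(key, MISSING)
        if prod_value != stage_value:
            drift[key] = (prod_value, stage_value)
    return drift
```

An empty result means the two snapshots agree; anything else is a concrete list of items to bring back in sync during maintenance time.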
A lot of maintenance can be done through modern infrastructure practices, namely by using infrastructure-as-code to provision environments. IaC, especially when combined with a simple branching strategy, helps ensure that staging and production servers are similar. Tooling has evolved greatly over the past few years, and can now import existing infrastructure into a brand-new codebase. These kinds of tools allow you to start with a clean slate. Plus, they make future infra additions much less painful.
Open-source tools like terraformer can be extremely useful in these cases. Cloud providers like GCP, AWS, and Azure also have such tooling available.
Adapt and iterate
Remember: a one-size-fits-all approach doesn’t work. Make sure that you adapt everything you’ve learned so far to your specific organization and your team’s specific needs. The branching strategy and data replication practices that work for one team won’t necessarily work for another.
These complex changes can take months, sometimes years, to get right. Don’t be disheartened by failed attempts — instead, look for ways to deliver value in small pieces. Iterate changes and learn from the process.
Adding a staging environment
Now that we’ve established the benefits and drawbacks of staging environments, we’ll dive into some key steps for setting one up.
Gather information
There’s a lot of information to gather before you create a new environment. Make sure you know the services that need to be deployed, where they live, who their owners are, and where the DNS configuration is.
What infrastructure components need to be in place? Collect all the databases, load balancers, buckets, and resources you need for a completely dedicated environment.
If you have SaaS tools in place for error tracking and data analytics, gather your tooling. Note that you’ll likely want a different key to identify projects in staging.
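The information gathered here can be kept as a simple inventory and checked for gaps before any work starts. This is only a sketch: the ServiceEntry fields mirror the questions above (where the service lives, who owns it, which DNS record it needs), and every name in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ServiceEntry:
    name: str
    repository: Optional[str] = None   # where the service lives
    owner: Optional[str] = None        # team or person responsible for it
    dns_record: Optional[str] = None   # hostname planned for the new environment


def missing_details(entries: List[ServiceEntry]) -> Dict[str, List[str]]:
    """Map each incomplete service to the fields that still need an answer."""
    required = ("repository", "owner", "dns_record")
    gaps = {}
    for entry in entries:
        missing = [name for name in required if getattr(entry, name) is None]
        if missing:
            gaps[entry.name] = missing
    return gaps
```

An empty result from missing_details is a reasonable signal that the information-gathering step is complete for the listed services.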
Break it down
Divide the project into smaller pieces and assign ownership to each part. As with any other major engineering change or migration, this project needs oversight and close management. Each service will need a separate configuration for each environment. This is usually done with a YAML or JSON file, but can vary depending on your tech stack.
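A per-environment configuration might look like the sketch below, which layers environment-specific values over shared defaults. The JSON layout, key names, and placeholder values are assumptions made for illustration; your stack may use YAML or a different structure entirely.

```python
import json

# Hypothetical configuration document: shared defaults plus per-environment overrides.
CONFIG_JSON = """
{
  "default": {"log_level": "info", "replicas": 2, "error_tracking_key": null},
  "staging": {"log_level": "debug", "replicas": 1,
              "error_tracking_key": "staging-key-placeholder"},
  "production": {"replicas": 6,
                 "error_tracking_key": "production-key-placeholder"}
}
"""


def load_config(environment: str, raw: str = CONFIG_JSON) -> dict:
    """Merge the defaults with the overrides for one environment."""
    document = json.loads(raw)
    if environment == "default" or environment not in document:
        raise KeyError(f"no configuration for environment {environment!r}")
    # Later keys win, so environment values override the shared defaults.
    return {**document["default"], **document[environment]}
```

Keeping the defaults in one place means each environment only declares what actually differs, which makes unintended divergence between staging and production easier to spot in review.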
At this stage, the goal is to enable every service owner to independently (and successfully) deploy each service to the new environments, or to provide you with the configuration, so you can deploy the service in a pipeline.
While owners are working on their projects, identify the infrastructure changes you need on your end. Create a completely separate AWS, GCP, or Azure project, and confirm the unique best practices for your cloud provider platform.
Tying things together
Once information is gathered and project components are assigned, it’s time to deploy services manually. It’s best to start small: deploy a few services and see if they can interact with each other. You can also connect monitoring, SaaS, and error tracking tools at this stage.
Once you’ve successfully deployed services and completed end-to-end testing with a reasonable degree of confidence, you can move into automatic deployments. At this point, announce the staging environment to the rest of the team so they can play around with it.
Keep in mind
As helpful as a staging environment can be, it’s important to remember that it is not a silver bullet. Developer environments can introduce new considerations: Do you want to double your infrastructure budget for internal tooling? How do you generate data that is both similar to production and that doesn’t violate data privacy rules?
As with most things, it’s important to consider the big picture before adopting staging environments. They can enable autonomy across your organization, improve product quality, and empower engineers to take more control over infrastructure. They can also add operational and financial overhead.
While the decision ultimately depends on your organization and its unique needs, the benefits of staging environments usually outweigh the costs, especially in the long run.