Competitive pressures in many domains, as well as development paradigms such as Agile and DevSecOps, have led to the increasingly common practice of continuous delivery or continuous deployment—rapid and frequent changes and updates to software systems. In today’s systems, releases can occur at any time—possibly hundreds of releases per day—and each can be instigated by a different team within an organization. Being able to release frequently means that bug fixes and security patches do not have to wait until the next scheduled release, but rather can be made and released as soon as a bug is discovered and fixed. It also means that new features need not be bundled into a release but can be put into production at any time. In this blog post, excerpted from the fourth edition of Software Architecture in Practice, which I coauthored with Len Bass and Paul Clements, I discuss the quality attribute of deployability and describe two associated categories of architecture patterns: patterns for structuring services and for how to deploy services.
Continuous deployment is not desirable, or even possible, in all domains. If your software exists in a complex ecosystem with many dependencies, it may not be possible to release just one part of it without coordinating that release with the other parts. In addition, many embedded systems, systems residing in hard-to-access locations, and systems that are not networked would be poor candidates for a continuous deployment mindset.
This post focuses on the large and growing numbers of systems for which just-in-time feature releases are a significant competitive advantage, and just-in-time bug fixes are essential to safety or security or continuous operation. Often these systems are microservice and cloud-based, although the techniques described here are not limited to those technologies.
Deployment is a process that starts with coding and ends with real users interacting with the system in a production environment. If this process is fully automated—that is, if there is no human intervention—then it is called continuous deployment. If the process is automated up to the point of placing (portions of) the system into production and human intervention is required (perhaps due to regulations or policies) for this final step, the process is called continuous delivery.
To speed up releases, we need to introduce the concept of a deployment pipeline: the sequence of tools and activities that begins when you check your code into a version control system and ends when your application has been deployed so that users can send it requests. In between those points, a series of tools integrate and automatically test the newly committed code, test the integrated code for functionality, and test the application for concerns such as performance under load, security, and license compliance.
Each stage in the deployment pipeline takes place in an environment established to support isolation of the stage and perform the actions appropriate to that stage. The major environments are as follows:
- Code is written in a development environment for a single module where it is subject to standalone unit tests. After it passes the tests, and after appropriate review, the code is committed to a version control system that triggers the build activities in the integration environment.
- An integration environment builds an executable version of your service. A continuous integration server compiles your new or changed code, along with the latest compatible versions of code for other portions of your service, and constructs an executable image for your service (any independently deployable unit). Tests in the integration environment include the unit tests from the various modules (now run against the built system), as well as integration tests designed specifically for the whole system. When the various tests are passed, the built service is promoted to the staging environment.
- A staging environment tests for various qualities of the total system. These include performance testing, security testing, license conformance checks, and possibly user testing. For embedded systems, this is where simulators of the physical environment (feeding synthetic inputs to the system) are brought to bear. An application that passes all staging environment tests—which may include field testing—is deployed to the production environment, using either a blue/green model or a rolling upgrade. In some cases, partial deployments are used for quality control or to test the market response to a proposed change or offering.
- Once in the production environment, the service is monitored closely until all parties have some level of confidence in its quality. At that point, it is considered a normal part of the system and receives the same amount of attention as the other parts of the system.
You perform a different set of tests in each environment, expanding the testing scope from unit testing of a single module in the development environment, to functional testing of all the components that make up your service in the integration environment, and ending with broad quality testing in the staging environment and usage monitoring in the production environment.
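As a rough illustration of how a build moves through these environments, here is a minimal Python sketch. The `Environment` class, the `run_pipeline` function, and the `manual_gate` and `approved` parameters are hypothetical names invented for this example; they do not come from any particular CI/CD tool. A build is promoted only when every check in its current environment passes, and the optional manual gate models the difference between continuous delivery and continuous deployment.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A check takes a build id and returns True when the build passes it.
Check = Callable[[str], bool]


@dataclass
class Environment:
    """One pre-production environment and the checks run in it (hypothetical model)."""
    name: str
    checks: List[Check] = field(default_factory=list)


def run_pipeline(build_id: str,
                 environments: List[Environment],
                 manual_gate: bool = False,
                 approved: bool = False) -> str:
    """Return the name of the environment where the build ends up.

    environments lists the pre-production environments in promotion order,
    for example development, integration, staging.
    """
    if not environments:
        raise ValueError("at least one pre-production environment is required")
    for env in environments:
        # A build that fails any check stays in the environment where it failed.
        if not all(check(build_id) for check in env.checks):
            return env.name
    # Continuous delivery: a human must approve the final step into production.
    if manual_gate and not approved:
        return environments[-1].name
    # Continuous deployment: no human intervention is needed.
    return "production"
```

With `manual_gate=False` the sketch behaves like continuous deployment. With `manual_gate=True`, a build that passes every staging check waits in staging until someone approves it, which corresponds to continuous delivery.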
But not everything always goes according to plan. If you find problems after the software is in its production environment, it is often necessary to roll back to a previous version while the defect is being addressed.
Architectural choices affect deployability. For example, when you employ the microservice architecture pattern, each team responsible for a microservice can make its own technology choices; this removes incompatibility problems that would previously have been discovered at integration time (e.g., incompatible choices of which version of a library to use). Because microservices are independent services, such choices do not cause problems.
Similarly, a continuous deployment mindset forces you to think about the testing infrastructure earlier in the development process. This “shift left testing” is necessary because designing for continuous deployment requires continuous automated testing. In addition, the need to be able to roll back or disable features leads to architectural decisions about mechanisms, such as feature toggles and backward compatibility of interfaces. These decisions are best taken early on.
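To make the idea of a feature toggle concrete, here is a minimal Python sketch. The `FeatureToggles` class, the `new_discount` flag, and the `checkout_total` function are all hypothetical, made up for illustration; a real system would more likely read its flags from configuration or a dedicated service. The point is only that new behavior ships in the deployed code but can be switched off without redeploying.

```python
from typing import Dict, Iterable, Optional


class FeatureToggles:
    """In-memory toggle store (illustrative assumption, not a real library)."""

    def __init__(self, flags: Optional[Dict[str, bool]] = None) -> None:
        self._flags = dict(flags or {})

    def is_enabled(self, name: str) -> bool:
        # Unknown features default to off, so new code paths stay dark
        # until someone turns them on deliberately.
        return self._flags.get(name, False)

    def set(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled


def checkout_total(prices_in_cents: Iterable[int], toggles: FeatureToggles) -> int:
    """Sum the prices, applying a hypothetical 10% discount behind a toggle."""
    subtotal = sum(prices_in_cents)
    if toggles.is_enabled("new_discount"):
        return subtotal * 90 // 100  # new behavior, can be disabled at run time
    return subtotal                  # old behavior remains available
```

Disabling the feature is then a call such as `toggles.set("new_discount", False)` rather than a rollback of the whole release.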
Deployability refers to a property of software indicating that it may be deployed—that is, allocated to an environment for execution—within a predictable and acceptable amount of time and effort. Moreover, if the new deployment is not meeting its specifications, it may be rolled back, again within a predictable and acceptable amount of time and effort. As the world moves increasingly toward virtualization and cloud infrastructures, and as the scale of deployed software-intensive systems inevitably increases, it is one of the architect’s responsibilities to ensure that deployment is done efficiently and predictably, minimizing overall system risk.
To achieve these goals, an architect needs to consider how an executable is updated on a host platform, as well as how it is subsequently invoked, measured, monitored, and controlled. Mobile systems in particular present a challenge for deployability in terms of how they are updated because of bandwidth constraints. Some of the issues involved in deploying software are as follows:
- How does it arrive at its host (i.e., push, where updates are deployed unbidden, or pull, where users or administrators must explicitly request updates)?
- How is it integrated into an existing system? Can this be done while the existing system is executing?
- What is the medium, such as USB drive or Internet delivery?
- What is the packaging (e.g., executable, app, plug-in)?
- What is the resulting integration into an existing system?
- What is the efficiency of executing the process?
- What is the controllability of the process?
With all of these concerns, the architect must be able to assess the associated risks. Architects are primarily concerned with the degree to which the architecture supports deployments that are:
- Granular—Deployments can be of the whole system or of elements within a system. If the architecture provides options for finer granularity of deployment, then certain risks can be reduced.
- Controllable—The architecture should provide the capability to deploy at varying levels of granularity, monitor the operation of the deployed units, and roll back unsuccessful deployments.
- Efficient—The architecture should support rapid deployment (and, if needed, rollback) with a reasonable level of effort.
Architecture Patterns for Deployability
Architecture patterns for deployability can be organized into two categories. The first category contains patterns for structuring services to be deployed. The second category contains patterns for how to deploy services, which can be parsed into two broad subcategories: all-or-nothing or partial deployment. The two main categories for deployability are not completely independent of each other because certain deployment patterns depend on certain structural properties of the services.
Pattern for Structuring Services
Microservice architecture—The microservice architecture pattern structures the system as a collection of independently deployable services that communicate only via messages through service interfaces. There is no other form of interprocess communication allowed: no direct linking, no direct reads of another service’s data store, no shared-memory model, no back-doors whatsoever. Services are usually stateless, and (because they are developed by a single relatively small team) are relatively small—hence the term microservice. Service dependencies are acyclic. An integral part of this pattern is a discovery service so that messages can be appropriately routed.
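The role of the discovery service can be sketched in a few lines of Python. This is a toy, in-memory model under stated assumptions: the `DiscoveryService` class, its method names, and the round-robin selection are invented for illustration and do not reflect the API of any real discovery product. Services register their instances under a name, and clients ask the registry where to send the next message.

```python
from typing import Dict, List


class DiscoveryService:
    """Toy registry mapping a service name to its registered instances."""

    def __init__(self) -> None:
        self._instances: Dict[str, List[str]] = {}
        self._cursors: Dict[str, int] = {}

    def register(self, service: str, address: str) -> None:
        self._instances.setdefault(service, []).append(address)

    def deregister(self, service: str, address: str) -> None:
        instances = self._instances.get(service, [])
        if address in instances:
            instances.remove(address)

    def resolve(self, service: str) -> str:
        """Return the address of one instance, rotating round-robin."""
        instances = self._instances.get(service)
        if not instances:
            raise LookupError(f"no instances registered for {service!r}")
        index = self._cursors.get(service, 0) % len(instances)
        self._cursors[service] = index + 1
        return instances[index]
```

Because clients look up an address on each request instead of hard-coding one, instances can be added, removed, or replaced without the clients changing, which is what the deployment patterns in the following sections rely on.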
Patterns for Complete Replacement of Services
Suppose there are N instances of Service A and you wish to replace them with N instances of a new version of Service A, leaving no instances of the original version. You wish to do this with no reduction in quality of service to the clients of the service, so there must always be N instances of the service running.
Two different patterns for the complete replacement strategy are possible, both of which are realizations of the scale rollouts tactic. We’ll cover them both together:
- Blue/green—In a blue/green deployment, N new instances of the service would be created and each populated with new Service A (let’s call these the green instances). After the N instances of new Service A are installed, the DNS server or discovery service would be changed to point to the new version of Service A. Once it is determined that the new instances are working satisfactorily, then and only then are the N instances of the original Service A removed. Before this cutoff point, if a problem is found in the new version, it is a simple matter of switching back to the original (the blue services) with little or no interruption.
- Rolling upgrade—A rolling upgrade replaces the instances of Service A with instances of the new version of Service A one at a time. (In practice, you can replace more than one instance at a time, but only a small fraction are replaced in any single step.) The steps of the rolling upgrade are as follows:
a. Allocate resources for a new instance of Service A (e.g., a virtual machine).
b. Install and register the new version of Service A.
c. Begin to direct requests to the new version of Service A.
d. Choose an instance of the old Service A, allow it to complete any active processing, and then destroy that instance.
e. Repeat the preceding steps until all instances of the old version have been replaced.
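The two replacement patterns above can be sketched as a toy Python model. Everything here is an assumption made for illustration: instances are represented as plain version strings, and `blue_green_switch`, `rolling_upgrade`, `make_green`, and `is_healthy` are hypothetical names rather than calls into any real orchestration tool. The sketch shows only the ordering of steps, not real provisioning or traffic routing.

```python
from typing import Callable, Iterator, List


def blue_green_switch(blue: List[str],
                      make_green: Callable[[], str],
                      is_healthy: Callable[[str], bool]) -> List[str]:
    """Return the set of instances that should serve traffic after the switch."""
    # Create N new (green) instances alongside the N old (blue) ones.
    green = [make_green() for _ in blue]
    # Repoint DNS or the discovery service at the green instances.
    active = green
    # If the new version misbehaves, switch back; blue was never removed.
    if not all(is_healthy(instance) for instance in green):
        active = blue
    return active


def rolling_upgrade(instances: List[str], old: str, new: str) -> Iterator[List[str]]:
    """Yield a snapshot of the pool after each old instance is replaced."""
    if old == new:
        raise ValueError("old and new versions must differ")
    pool = list(instances)
    while old in pool:
        # Steps a-c: allocate resources, install and register the new
        # version, and begin directing requests to it.
        pool.append(new)
        # Step d: let one old instance finish its work, then destroy it.
        pool.remove(old)
        # Step e: repeat until no old instances remain.
        yield list(pool)
```

Note the difference in resource cost that the model makes visible: blue/green briefly holds 2N instances, while the rolling upgrade never holds more than N + 1, at the price of running two versions side by side during the upgrade.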
Patterns for Partial Replacement of Services
Sometimes changing all instances of a service is undesirable. Partial-deployment patterns aim at providing multiple versions of a service simultaneously for different user groups; they are used for purposes such as quality control (canary testing) and marketing tests (A/B testing).
- Canary testing—Before rolling out a new release, it is prudent to test it in the production environment, but with a limited set of users. Canary testing is the continuous deployment analog of beta testing. Canary testing is named after the 19th-century practice of bringing canaries into coal mines. Coal mining releases gases that are explosive and poisonous. Because canaries are more sensitive to these gases than humans, coal miners brought canaries into the mines and watched them for signs of reaction to the gases. The canaries acted as early warning devices for the miners, indicating an unsafe environment.
Canary testing designates a small set of users who will test the new release. Sometimes, these testers are so-called power users or preview-stream users from outside your organization who are more likely to exercise code paths and edge cases that typical users may use less frequently. Users may or may not know that they are being used as guinea pigs—er, that is, canaries. Another approach is to use testers from within the organization that is developing the software. For example, Google employees almost never use the release that external users would be using, but instead act as testers for upcoming releases. When the focus of the testing is on determining how well new features are accepted, a variant of canary testing called dark launch is used.
In both cases, the users are designated as canaries and routed to the appropriate version of a service through DNS settings or through discovery-service configuration. After testing is complete, users are all directed to either the new version or the old version, and instances of the deprecated version are destroyed. Rolling upgrade or blue/green deployment could be used to deploy the new version.
- A/B testing—A/B testing is used by marketers to perform an experiment with real users to determine which of several alternatives yields the best business results. A small but meaningful number of users receive a different treatment from the remainder of the users. The difference can be minor, such as a change to the font size or form layout, or it can be more significant. The “winner” would be kept, the “loser” discarded, and another contender designed and deployed. An example is a bank offering different promotions to open new accounts. An oft-repeated story is that Google tested 41 different shades of blue to decide which shade to use to report search results.
As in canary testing, DNS servers and discovery-service configurations are set to send client requests to different versions. In A/B testing, the different versions are monitored to see which one provides the best response from a business perspective.
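One common way to split users between versions is to hash a stable user identifier into a bucket. The Python sketch below assumes that approach; the text above does not prescribe it, and the `bucket` and `route` functions, the `canary_percent` parameter, and the `canary_users` set are hypothetical names introduced for this example. In practice the routing decision would live in DNS settings or discovery-service configuration, as described above.

```python
import hashlib
from typing import AbstractSet


def bucket(user_id: str, buckets: int = 100) -> int:
    """Map a user id to a stable bucket number in the range [0, buckets)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def route(user_id: str,
          canary_percent: int,
          canary_users: AbstractSet[str] = frozenset()) -> str:
    """Return "new" or "old" to say which version should serve this user."""
    # Explicitly designated canaries (e.g., internal testers) always get
    # the new version.
    if user_id in canary_users:
        return "new"
    # Everyone else is split by hash, so a given user sees the same
    # version on every request.
    return "new" if bucket(user_id) < canary_percent else "old"
```

The same function serves both patterns: for canary testing, `canary_percent` starts small and the new version is watched for defects; for A/B testing, the two groups are compared on a business measure instead.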
The Increasing Importance of Deployability
Deployability is, relatively speaking, a new system concern. But it is taking on increasing importance as the world of software moves more and more to cloud-based, microservice-based deployments. Like any other property of a software-intensive system, deployability can be designed for and managed through proper attention to architecture. In fact, you will not achieve high release velocity and high quality without such attention. Although this is couched as a warning, it is really good news. It means that you, as an architect, can plan for deployability and can achieve it, just as you would achieve high performance or high modifiability, by choosing appropriate tactics and patterns, and by early analysis of your designs. In the fourth edition of Software Architecture in Practice, we provide all the design and analysis tools you need to do just that.