Starting CD
Migrating your system to Continuous Delivery
Introduction to CD
Continuous delivery is the ability to deliver the latest changes on-demand. CD is not build/deploy automation. It is the continuous flow of changes to the end-user with no human touchpoints between code integrating to the trunk and delivery to production. This can take the form of triggered delivery of small batches or the immediate release of the most recent code change.
CD is not a reckless throwing of random change into production. Instead, it is a disciplined team activity of relentlessly automating all of the validations required for a release candidate, improving the speed and reliability of quality feedback, and collaborating to improve the quality of the information used to develop changes.
CD is based on and extends the extreme programming practice of continuous integration. There is no CD without CI.
The path to continuous integration and continuous delivery may seem daunting to teams that are just starting out. We offer this guide to getting started with a focus on outcome metrics to track progress.
Continuous Delivery is far more than automation. It is the entire cycle of identifying value, delivering the value, and verifying
with the end user that we delivered the expected value. The shorter we can make that feedback loop, the better our bottom line will
be.
Goals
Both CI and CD are behaviors intended to achieve specific goals. CI is very effective at uncovering issues in work decomposition and testing within the team’s processes so that the team can improve them. CD is effective at uncovering external dependencies, organizational process issues, and test architecture issues that add waste and cost.
The relentless improvement of how we implement CD reduces overhead, improves quality feedback, and improves both end-user outcomes and the team’s work/life balance.
CD Maturity
It has been common for organizations to apply “maturity models” to activities such as CD. However, this has been found to lead to cargo culting and aligning goals to the process instead of the outcomes. Understanding what capabilities you have and what capabilities need to be added to fully validate and operate changes are important, but the goals should always align to improving the flow of value delivery to the end-user. This requires analyzing every process from idea to delivery and identifying what should be removed, what should be automated, and how we can continuously reduce the size of changes delivered.
There should never be an understanding that we are “mature” or “immature” with delivery. We can always improve. However, there should be an understanding of what competency looks like.
Minimums
- Each developer integrates tested changes to the trunk at least daily.
- Changes always use the same process to deliver.
- There are no process differences between deploying a feature or a fix.
- There are no manual quality gates.
- All test and production environments use the same artifact.
- If the release cadence requires release branches, then the release branches deploy to all test environments and production.
Good
- New work requires less than 2 days from start to delivery
- All changes deliver from the trunk
- The time from committing change and delivery to production is less than 60 minutes
- Less than 5% of changes require remediation
- The time to restore service is less than 60 minutes.
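These thresholds can be checked mechanically against deployment history. A minimal sketch; the record format and sample data below are hypothetical, not taken from any specific tool:

```python
from datetime import datetime

# Hypothetical deployment records: (commit_time, production_time, needed_remediation)
deploys = [
    (datetime(2024, 1, 1, 9, 0),  datetime(2024, 1, 1, 9, 45),  False),
    (datetime(2024, 1, 2, 10, 0), datetime(2024, 1, 2, 10, 50), False),
    (datetime(2024, 1, 3, 14, 0), datetime(2024, 1, 3, 15, 10), True),
]

def lead_times_minutes(records):
    """Minutes from commit to production for each change (target: under 60)."""
    return [(prod - commit).total_seconds() / 60 for commit, prod, _ in records]

def change_failure_rate(records):
    """Fraction of changes that required remediation (target: under 5%)."""
    return sum(1 for *_, failed in records if failed) / len(records)

lead = lead_times_minutes(deploys)   # [45.0, 50.0, 70.0] -- one change over target
cfr = change_failure_rate(deploys)   # one failure in three, well above 5%
```

Tracking these numbers per change, rather than per quarter, keeps the feedback loop short enough to act on.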
Continuous Integration
This working agreement for CI focuses on developing teamwork and delivering quality outcomes while removing waste.
- Branches originate from the trunk.
- Branches are deleted in less than 24 hours.
- Changes must be tested and not break existing tests before merging to the trunk.
- Changes are not required to be “feature complete”.
- Helping the team complete work in progress (code review, pairing) is more important than starting
new work.
- Fixing a broken build is the team’s highest priority.
Desired outcomes: increased trunk merge frequency, reduced development cycle time, and increased delivery frequency.
Continuous Delivery/Deploy
Recommended Practices
While implementation is contextual to the product, there are key
steps that should be done whenever starting the CD journey.
- Value Stream Map: This is a standard Lean tool for making the development process visible and highlighting any constraints the team has. This is a critical step to begin improvement.
Build a road map of the constraints and use a disciplined improvement process
to remove them.
- Align to the Continuous Integration team working agreement and use the
impediments to feed the team’s improvement process.
- We always branch from Trunk.
- Branches last less than 24 hours.
- Changes must be tested and not break existing tests.
- Changes are not required to be “feature complete”.
- Code review is more important than starting new work.
- Fixing a broken build is the team’s highest priority.
- Build and continuously improve a single CD automated pipeline for each
repository. There should only be a single configuration for each repository
that will deploy to all test and production environments.
A valid CD process will have only a single method to build and deploy any
change. Any deviation for emergencies indicates an incomplete CD process that
puts the team and business at risk and must be improved.
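The value stream map from the first step above lends itself to simple arithmetic: flow efficiency (value-adding time over total lead time) and the longest queue fall straight out of the mapped steps. A sketch, with purely illustrative step names and durations:

```python
# Hypothetical value stream: (step, process_minutes, wait_minutes_before_step)
value_stream = [
    ("refine story",    60, 2880),
    ("develop",        480,  240),
    ("code review",     30,  960),
    ("deploy to test",  15, 1440),
    ("deploy to prod",  15, 4320),
]

def flow_efficiency(stream):
    """Value-adding time as a fraction of total lead time."""
    process = sum(p for _, p, _ in stream)
    lead = sum(p + w for _, p, w in stream)
    return process / lead

def largest_wait(stream):
    """The longest queue is a good first candidate for the constraint road map."""
    return max(stream, key=lambda step: step[2])[0]
```

In this example flow efficiency is under 6%, and the wait before production deploy dominates, so that queue would be the first constraint on the road map.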
Pipeline
Focus on hardening the pipeline. Its job is to block bad changes, and the team’s job is to develop its ability to do that. There is no separate emergency process: if a process will not be used to resolve a critical outage, it should not be part of the CD pipeline.
Integrate outside the pipeline. Virtualize inside the pipeline. Direct integration is not a good testing method for validating behavior because the data returned is not controlled. It IS a good way to validate service mocks. However, if direct integration runs in the pipeline, an unavailable dependency can block the pipeline exactly when a production fix is needed.
There should be one or fewer stage gates. Until release and deploy are decoupled, require at most one approval for production and no other stage gates.
Developers are responsible for the full pipeline. No handoffs. Handoffs cause delays and dilute ownership. The team owns its pipelines and the applications they deploy from birth to death.
Short CI Cycle Time
CI cycle time should be less than 10 minutes from commit to artifact creation. CD cycle time should be less than 60 minutes from commit to Production.
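Both targets are easy to verify from pipeline timestamps. A sketch, with hypothetical event times for one change:

```python
from datetime import datetime

# Hypothetical timestamps for one change moving through the pipeline
commit_at   = datetime(2024, 5, 1, 9, 0)
artifact_at = datetime(2024, 5, 1, 9, 8)   # versioned artifact published
deployed_at = datetime(2024, 5, 1, 9, 52)  # change live in production

def minutes_between(start, end):
    return (end - start).total_seconds() / 60

ci_minutes = minutes_between(commit_at, artifact_at)   # commit to artifact
cd_minutes = minutes_between(commit_at, deployed_at)   # commit to production

ci_ok = ci_minutes < 10
cd_ok = cd_minutes < 60
```

Trending these two durations per commit makes pipeline slowdowns visible as soon as they appear.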
Integrate outside the pipeline. Virtualize inside the pipeline
Direct integration with stateful dependencies (end-to-end testing) should be avoided in the pipeline. Tests in the pipeline should be deterministic, and the larger the number of integration points, the more difficult it is to manage state and maintain determinism. Direct integration is a good way to validate service mocks; however, if it runs in the pipeline, an unstable or unavailable dependency can block the pipeline exactly when a production fix is needed.
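One lightweight way to virtualize a dependency is to inject it, so the pipeline runs against a deterministic stub while the real integration is validated outside the pipeline. A minimal sketch; the pricing example and all function names are invented for illustration:

```python
def price_in_currency(amount_usd, rate_source):
    """rate_source is injected so the pipeline can substitute a virtual service."""
    return round(amount_usd * rate_source(), 2)

def live_rate():
    """Production path: would call the real exchange-rate service (non-deterministic)."""
    raise RuntimeError("live integration is validated outside the pipeline")

def stub_rate():
    """Virtual service used inside the pipeline: controlled, deterministic data."""
    return 0.5

# Pipeline test: always returns the same answer, regardless of the real service.
pipeline_price = price_in_currency(10.0, stub_rate)
```

The same pattern generalizes to recorded-response service virtualization tools; the key property is that the pipeline never depends on a live system being available.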
All test automation pre-commit
Tests should be co-located with the system under test and all acceptance testing should be done by the development team. The goal is not 100% coverage. The goal is efficient, fast, effective testing.
No manual steps
There should be no manual intervention after the code is integrated into the trunk. Manual steps inject defects.
Tips
Use trunk merge frequency,
development cycle time, and
delivery frequency to uncover pain points. The team has
complete control of merge frequency and development cycle time and can
uncover most issues by working to improve those two metrics.
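Trunk merge frequency, for example, can be computed directly from merge events; the team size and event counts below are hypothetical:

```python
from datetime import date

# Hypothetical trunk merge events for a five-person team over one working week
merges = ([date(2024, 5, 6)] * 3 + [date(2024, 5, 7)] * 2 +
          [date(2024, 5, 8)] * 3 + [date(2024, 5, 9), date(2024, 5, 10)])

def merges_per_dev_per_day(events, team_size, working_days):
    """CI expects at least 1.0: each developer integrating at least daily."""
    return len(events) / (team_size * working_days)

freq = merges_per_dev_per_day(merges, team_size=5, working_days=5)
# A value below 1.0 signals work should be decomposed into smaller daily changes.
```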
Make sure to keep all metrics visible and refer to them often to help drive the
change.
See CD best practices and CD Roadblocks for more tips on effectively introducing CI/CD improvements to your team processes.
1 - Common Blockers
The following are very frequent issues that teams encounter when working to improve the flow of delivery.
Work Breakdown
Stories without testable acceptance criteria
All stories should be defined with declarative and testable acceptance criteria. This reduces the amount
of waiting and rework once coding begins and enables a much smoother testing workflow.
Acceptance criteria should define “done” for the story. No behavior other than that specified by the acceptance
criteria should be implemented. This ensures we are consistently delivering what was agreed to.
Stories too large
It’s common for teams using two-week sprints to have stories that require five to ten days to complete. Large stories hide complexity, uncertainty, and dependencies.
- Stories represent the smallest user observable behavior change.
- To enable rapid feedback, higher-quality acceptance criteria, and more predictable delivery, stories should require no more than two days for a team to deliver.
No definition of “ready”
Teams should have a working agreement about the definition of “ready” for a story or task. Until the team agrees it has
the information it needs, no commitments should be made and the story should not be added to the “ready” backlog.
Definition of Ready
- Story
- Acceptance criteria aligned with the value statement agreed to and understood.
- Dependencies noted and resolution process for each in place
- Spikes resolved.
- Sub-task
- Contract changes documented
- Component acceptance tests defined
No definition of “Done”
Having an explicit definition of done is important to keeping WIP low and finishing work.
Definition of Done
- Sub-task
- Acceptance criteria met
- Automated tests verified
- Code reviewed
- Merged to Trunk
- Demoed to team
- Deployed to production
- Story
- PO Demo completed
- Acceptance criteria met
- All tasks "Done"
- Deployed to production
Team Workflow
Assigning tasks for the sprint
Work should always be pulled by the next available team member. Assigning tasks results in each team member working in isolation on a task list instead of the team
focusing on delivering the next high value item. It also means that people are less invested in the work other people
are doing. New work should be started only after helping others
complete work in progress.
Co-dependent releases
Multi-component release trains increase batch size and reduce delivered quality. Teams cannot improve efficiency if they
are constantly waiting. Handle dependencies with code, do not manage them with process. If you need a person to
coordinate releases, things are seriously broken.
Handoffs to other teams
If the normal flow of work requires waiting on another team then batch sizes increase and quality is reduced. Teams
should be organized so they can deliver their work without coordinating outside the team.
Early story refining
As soon as we decide a story has been refined to where we can begin developing it, the information begins to age because
we will never fully capture everything we decided on. The longer a story is “ready” before we begin working, the less
context we retain from the conversation. Warehoused stories age like milk. Limit the inventory and spend more time on
delivering current work.
Manual test as a stage gate
In this context, a test is a repeatable, deterministic activity to verify the releasability of the system. There are
manual activities related to exploration of edge cases and how usable the application is for the intended consumer, but these
are not tests.
There should be no manual validation as a step before we deploy a change. This includes, but is not limited to, manual
acceptance testing, change advisory boards (CAB), and manual security testing.
Meaningless retrospectives
Retrospectives should be metrics driven. Improvement items should be treated as business features.
Hardening / Testing / Tech Debt Sprints
Just no. These are not real things. Sprints represent work that can be
delivered to production.
Moving “resources” on and off teams to meet “demand”
Teams take time to grow; they cannot be “constructed”. Adding or removing anyone
from a team lowers the team’s maturity and average problem space expertise. Changing too many people on a team
reboots the team.
One delivery per sprint
Sprints are planning increments, not delivery increments. Plan what will be delivered daily during the sprint.
Skipping demo
If the team has nothing to demo, demo that. Never skip demo.
Committing to distant dates
Uncertainty increases with time. Distant deliverables need detailed analysis.
Not committing to dates
Commitments drive delivery. Commit to the next Minimum Viable Feature.
Velocity as a measure of productivity
Velocity is a planning metric: “We can typically get this much done in this much time.” It’s an estimate of relative
capacity for new work that tends to change over time, and these changes don’t necessarily indicate a shift in productivity. It’s
also an arbitrary measure that varies wildly between organizations, teams, and products. There’s no credible means of
translating it into a normalized figure that can be used for meaningful comparison.
Equating velocity with productivity creates an incentive to optimize velocity at the expense of developing quality software.
CD Anti-Patterns
Work Breakdown
| Issue | Description | Good Practice |
| --- | --- | --- |
| Unclear requirements | Stories without testable acceptance criteria | Work should be defined with acceptance tests to improve clarity and enable developer-driven testing. |
| Long development time | Stories take too long to deliver to the end user | Use BDD to decompose work into testable acceptance criteria to find smaller deliverables that can be completed in less than 2 days. |
Workflow Management
| Issue | Description | Good Practice |
| --- | --- | --- |
| Rubber band scope | Scope that keeps expanding over time | Use BDD to clearly define the scope of a story and never expand it after it begins. |
| Focusing on individual productivity | Attempting to manage a team by reporting the “productivity” of individual team members. This is the fastest way to destroy teamwork. | Measure team efficiency, effectiveness, and morale. |
| Estimation based on resource assignment | Pre-allocating backlog items to people based on skill and hoping that those people do not have life events. | The whole team should own the team’s work. Work should be pulled in priority sequence and the team should work daily to remove knowledge silos. |
| Meaningless retrospectives | Having a retrospective whose outcome does not result in team improvement items. | Focus the retrospective on the main constraints to daily delivery of value. |
| Skipping demo | No work that can be demoed was completed. | Demo the fact that no work is ready to demo. |
| No definition of “Done” or “Ready” | Obvious | Make sure there are clear entry gates for “ready” and “done” and that the gates are applied without exception. |
| One or fewer deliveries per sprint | The sprint results in one or fewer changes that are production ready | Sprints are planning increments, not delivery increments. Plan what will be delivered daily during the sprint. Uncertainty increases with time. Distant deliverables need detailed analysis. |
| Pre-assigned work | Assigning the list of tasks each person will do as part of sprint planning. This results in each team member working in isolation on a task list instead of the team focusing on delivering the next high value item. | The whole team should own the team’s work. Work should be pulled in priority sequence and the team should work daily to remove knowledge silos. |
Teams
| Issue | Description | Good Practice |
| --- | --- | --- |
| Unstable team tenure | People are frequently moved between teams | Teams take time to grow. Adding or removing anyone from a team lowers the team’s maturity and average expertise in the solution. Be mindful of change management. |
| Poor teamwork | Poor communication between team members due to time delays or “expert knowledge” silos | Make sure there is sufficient time overlap and that specific portions of the system are not assigned to individuals. |
| Multi-team deploys | Requiring more than one team to deliver synchronously reduces the ability to respond to production issues in a timely manner and delays delivery of any feature to the speed of the slowest team. | Make sure all dependencies between teams are handled in ways that allow teams to deploy independently in any sequence. |
Testing Process
| Issue | Description | Good Practice |
| --- | --- | --- |
| Outsourced testing | Some or all acceptance testing performed by a different team or an assigned subset of the product team. | Building in quality feedback and continuously improving it is the responsibility of the development team. |
| Manual testing | Using manual testing for functional acceptance testing. | Manual tests should only be used for things that cannot be automated. In addition, manual tests should not block delivery; they should be asynchronous validations. |
2 - Pipeline & Application Architecture
Whenever teams or areas want to improve their ability to deliver, there is a recommended order of operations to ensure the
improvement is effective. This value stream improvement journey’s goal is to provide the steps and guide you to good implementation
practices.
Prerequisite: Please review the CD Getting Started guide for context.
1. Build a Deployment Pipeline
Before any meaningful improvement can happen, the first constraint must be cleared. We need to make sure there is a single,
automated deployment pipeline to production. Human intervention after the code is integrated should be limited to approving
stage gates to trigger automation where needed.
A well-architected pipeline will build an artifact once and deploy that artifact to all required test environments for validation
and deliver changes safely to production.
It will also trigger all of the tests and provide rapid feedback as near the source of failure as possible. This is critical for
informing the developer who created the defect so that they have the chance to learn the reasons the defect was created and prevent
future occurrences.
With an entangled architecture, there is no clear ownership of individual components or their quality. Every team could cause a
defect anywhere in the system because they are not working within product boundaries. The pipeline’s quality signal will
be delayed compared to better-optimized team architectures. When a defect is found, it will require effort to identify
which team
created the defect and a multi-team effort to improve the development process to prevent regression. Continuous delivery
is difficult with this architecture.
The journey to CD begins with each team executing continuous
integration on a team branch and those branches are
integrated automatically into a master CI flow daily.
Any breaks in the pipeline should be addressed immediately by the team who owns the branch.
Common Entangled Practices
Team Structure: Feature teams focused on cross-cutting deliverables instead of product ownership and capability expertise.
Development Process: Long-lived feature branches integrated after features are complete.
Branching: Team branches, with each team working towards CI on its branch, and daily integration of team branches
to the trunk that re-runs the team-level tests.
Inverted Test Pyramid: The “ice cream cone testing” anti-pattern is
common. However, the teams should be focusing on improving the quality feedback and engineering tests that alert earlier
in the build cycle.
Pipeline: Establishing reliable build/deploy automation is a high priority.
Deploy Cadence / Risk: Delivery cadence in this architecture tends to be extended. This in turn leads to large code
change delta and high risk.
Improvement Plan
Find the architectural boundaries in the application that can be used to divide sub-systems between the
teams to create product teams. This step realigns the teams to a tightly coupled
architecture with defined ownership, improves quality outcomes, and
allows them to further decouple the system using the [Strangler Fig](https://martinfowler.com/bliki/StranglerFigApplication.html) process.
Tightly Coupled Architecture - Transitional
With tightly coupled architecture, changes in one portion of the application can cause unexpected changes in another portion of
the application. It’s quite common for even simple changes to take days or weeks of analysis to verify the implications of the
change.
Tightly coupled applications have sub-assemblies assigned to product teams along logical application boundaries. This enables each
team to establish a quality signal for their components and have the feedback required for improving their quality process. This
architecture requires a more complicated integration pipeline to make sure each of the components can be tested
individually and as a larger application. Simplifying the pipelines and decoupling the application will result in higher
quality with less overhead.
Common Tightly Coupled Practices
Team Structure: Product teams focused on further de-coupling sub-systems
Development Process: Continuous integration. Small, tested changes are applied to the trunk as soon as complete on each product team. In addition, a larger CI pipeline is required to frequently run larger tests on the
integrated system, at least once per day.
Branching: Because CI requires frequent updates to the trunk, [Trunk-Based Development](https://trunkbaseddevelopment.com) is used for CI.
Developer Driven Testing: The team is responsible for
architecting and continuously improving a suite of tests that give rapid feedback on quality issues. The team is also responsible
for the outcomes of poor testing, such as L1 support. This is a critical feedback loop for quality improvement.
Pipeline: CI pipeline working to progress to continuous delivery.
Deploy Cadence / Risk: Deliveries can be more frequent. Risk is inversely proportional to delivery frequency.
Improvement Plan
- As more changes are needed, the team continues extracting independent [domain services](https://www.amazon.com/Implementing-Domain-Driven-Design-Vaughn-Vernon/dp/0321834577) with well-defined APIs.
- For infrequently changed portions of the application that are poorly tested, re-writing may result in lost business
capabilities. Wrapping these components in an API without re-architecting may be a better solution.
Loosely Coupled Architecture - Goal
With a loosely coupled architecture, components are delivered independently of each other in any sequence. This reduces
complexity and improves quality feedback loops. This not only relies on clean separations of teams and sub-assemblies but also on mature testing practices that include the use of virtual services to verify integration.
It’s critical when planning to decompose to smaller services that Domain Driven
Design is used to inform service boundaries, value objects, and team
ownership. Services should use good micro-service design patterns.
Once we have built our production deployment pipeline, the next most critical constraint to address is the trustworthiness of our
tests.
Common Loosely Coupled Practices
Team Structure: Product teams maintain independent components with well-defined APIs.
Development Process: Continuous integration. Small, tested changes are applied to the trunk as soon as complete on each product team.
Branching: Because CI requires frequent updates to the trunk, [Trunk-Based Development](https://trunkbaseddevelopment.com) is used for CI.
Developer Driven Testing: The team is responsible for
architecting and continuously improving a suite of tests that give rapid feedback on quality issues. The team is also responsible
for the outcomes of poor testing, such as L1 support. This is a critical feedback loop for quality improvement.
Pipeline: One or more CD pipelines that are independently deployable at any time in any sequence.
Deploy Cadence / Risk: Deliveries can occur on demand or immediately after being verified by the pipeline. Risk is
inversely proportional to delivery frequency.
2. Stabilize the Quality Signal
Establishing a production pipeline allows us to evaluate and improve our quality signal. Quality gates should
be designed to inform the team of poor quality as close to the source as possible. This goal will be disrupted by
unstable tests.
Unstable test results will create a lack of trust in the test results and encourage bypassing test failure. To correct this:
- Remove flaky tests from the pipeline
to ensure that tests in the pipeline are trusted by the team
- Identify the causes for test instability and take corrective action
- If the test can be stabilized and provides value, correct it and move it back into the pipeline
- If it cannot be stabilized but is required, schedule it to run outside the pipeline
- If not required, remove it
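A test is flaky when identical code produces different results across runs. A sketch of classifying tests from re-run history; the test names and results are made up:

```python
# Hypothetical pass/fail history for each test, re-run against identical code
history = {
    "test_checkout_total": [True, True, True, True, True],
    "test_search_results": [True, False, True, False, True],
    "test_login_redirect": [False, False, False, False, False],
}

def classify(results):
    """Mixed results on the same code mean the test, not the code, is unstable."""
    if all(results):
        return "stable"
    if not any(results):
        return "failing"   # a consistent failure is a real quality signal
    return "flaky"         # quarantine: pull from the pipeline until stabilized

quarantined = sorted(name for name, runs in history.items()
                     if classify(runs) == "flaky")
```

Quarantined tests then follow the decision tree above: stabilize and restore, schedule outside the pipeline, or delete.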
In general, bias should be towards testing enough, but not over-testing. Tracking the
duration of the pipeline and enacting a quality gate for maximum pipeline duration (from PR merge to production) is a good way to keep testing efficient.
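Such a gate can be as simple as summing stage durations against a budget. A sketch; the 60-minute budget matches the cycle-time target earlier in this guide, and the stage names are illustrative:

```python
PIPELINE_BUDGET_MINUTES = 60   # assumed budget from merge to production

def duration_gate(stage_minutes):
    """Fail the pipeline run when total duration exceeds the budget."""
    total = sum(stage_minutes.values())
    return {"total": total,
            "passed": total <= PIPELINE_BUDGET_MINUTES,
            "over_by": max(0, total - PIPELINE_BUDGET_MINUTES)}

report = duration_gate({"build": 8, "acceptance tests": 31, "deploy": 12})
# 51 minutes total: within budget, so the gate passes
```

When the gate fails, the slowest stage becomes the next test-suite improvement target rather than a reason to delete tests indiscriminately.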
After stabilizing the quality signal, we can track where most of the defects are detected and the type of defects they
are. Start tracking the trends for the number of defects found in each environment and the root cause distribution of
the defects to inform the test suite improvement plan. Then focus the improvements on moving the majority of defect detection closer to the developer. The ultimate goal is for most defects to be trapped in the developer’s environment and not leak into the
deployment pipeline.
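Tracking that distribution is simple arithmetic; the counts below are hypothetical:

```python
# Hypothetical counts of where defects were first detected in one quarter
detections = {"developer environment": 34, "pipeline": 9, "production": 3}

def detection_share(counts):
    """Share of defects caught in each environment. The developer-environment
    share should trend toward 1.0 as feedback moves closer to the source."""
    total = sum(counts.values())
    return {env: round(n / total, 2) for env, n in counts.items()}

share = detection_share(detections)
```

Plotting this distribution per quarter shows whether test-suite improvements are actually shifting detection left.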
3. Continuous Improvement
After removing noise from the quality signal, we need to find and remove more waste on a
continuous basis. We start by mapping the deployment process from coding to production delivery and identifying the choke points
that are constraining the entire flow. The process for doing this and its effectiveness are documented in Goldratt’s “Theory of
Constraints” (TOC). The TOC states that the entire system is constrained
by one constraint and that improvement of the system will only be effective once that constraint is resolved.
- Identify the system constraint.
- Decide how to exploit the system constraint.
- Subordinate everything else to the above decisions.
- Elevate the constraint.
- If, in the previous steps, a constraint has been broken, go back to step one, but do not allow inertia to cause a new system constraint.
Some common constraints are:
- Resource Constraints - resources such as the number of people who can perform the task, access to environments, etc. which
block the flow based on its limited capacity for the desired outcomes.
- Policy Constraints - policies, practices or metrics that artificially impede flow due to their poor alignment with the overall performance of the system.
Working daily to relentlessly remove constraints is the most important work a team can do. Doing so means they are constantly
testing their improved delivery system by delivering value and constantly improving their ability to do so. Quality, predictability,
stability, and speed all improve.