Starting CD

Migrating your system to Continuous Delivery

Introduction to CD

Continuous delivery is the ability to deliver the latest changes on-demand. CD is not build/deploy automation. It is the continuous flow of changes to the end-user with no human touchpoints between code integrating to the trunk and delivery to production. This can take the form of triggered delivery of small batches or the immediate release of the most recent code change.

CD is not a reckless throwing of random change into production. Instead, it is a disciplined team activity of relentlessly automating all of the validations required for a release candidate, improving the speed and reliability of quality feedback, and collaborating to improve the quality of the information used to develop changes.

CD is based on and extends the extreme programming practice of continuous integration. There is no CD without CI.

The path to continuous integration and continuous delivery may seem daunting to teams that are just starting out. We offer this guide to getting started with a focus on outcome metrics to track progress.

CD Pipeline

Continuous Delivery is far more than automation. It is the entire cycle of identifying value, delivering the value, and verifying with the end user that we delivered the expected value. The shorter we can make that feedback loop, the better our bottom line will be.


Goals

Both CI and CD are behaviors intended to improve certain goals. CI is very effective at uncovering issues in work decomposition and testing within the team’s processes so that the team can improve them. CD is effective at uncovering external dependencies, organizational process issues, and test architecture issues that add waste and cost.

The relentless improvement of how we implement CD reduces overhead, improves quality feedback, and improves both the outcomes of the end-user and the work/life balance of the team.

CD Maturity

It has been common for organizations to apply “maturity models” to activities such as CD. However, this has been found to lead to cargo culting and aligning goals to the process instead of the outcomes. Understanding what capabilities you have and what capabilities need to be added to fully validate and operate changes are important, but the goals should always align to improving the flow of value delivery to the end-user. This requires analyzing every process from idea to delivery and identifying what should be removed, what should be automated, and how we can continuously reduce the size of changes delivered.

There should never be an understanding that we are “mature” or “immature” with delivery. We can always improve. However, there should be an understanding of what competency looks like.

Minimums

  • Each developer integrates tested changes to the trunk at least daily.
  • Changes always use the same process to deliver.
  • There are no process differences between deploying a feature or a fix.
  • There are no manual quality gates.
  • All test and production environments use the same artifact.
  • If the release cadence requires release branches, then the release branches deploy to all test environments and production.

Good

  • New work requires less than 2 days from start to delivery.
  • All changes deliver from the trunk.
  • The time from committing a change to delivering it to production is less than 60 minutes.
  • Less than 5% of changes require remediation.
  • The time to restore service is less than 60 minutes.

Continuous Integration

This working agreement for CI focuses on developing teamwork and delivering quality outcomes while removing waste.

  • Branches originate from the trunk.
  • Branches are deleted in less than 24 hours.
  • Changes must be tested and not break existing tests before merging to the trunk.
  • Changes are not required to be “feature complete”.
  • Helping the team complete work in progress (code review, pairing) is more important than starting new work.
  • Fixing a broken build is the team’s highest priority.

Desired outcomes:

Continuous Delivery/Deploy


While implementation is contextual to the product, there are key steps that should be done whenever starting the CD journey.

  • Value Stream Map: This is a standard Lean tool to make visible the development process and highlight any constraints the team has. This is a critical step to begin improvement. Build a road map of the constraints and use a disciplined improvement process to remove the constraints.
  • Align to the Continuous Integration team working agreement and use the impediments to feed the team’s improvement process.
  • We always branch from Trunk.
  • Branches last less than 24 hours.
  • Changes must be tested and not break existing tests.
  • Changes are not required to be “feature complete”.
  • Code review is more important than starting new work.
  • Fixing a broken build is the team’s highest priority.
  • Build and continuously improve a single CD automated pipeline for each repository. There should only be a single configuration for each repository that will deploy to all test and production environments.

A valid CD process will have only a single method to build and deploy any change. Any deviation for emergencies indicates an incomplete CD process that puts the team and business at risk and must be improved.
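As a rough illustration of the single-path principle (the script name, environment list, and rollout command below are hypothetical, not taken from any specific toolchain), every change, including an emergency fix, would flow through one parameterized entry point that deploys the same immutable artifact everywhere:

```python
#!/usr/bin/env python3
"""Hypothetical single deploy entry point: the same script and the same
immutable artifact are used for every environment, including production.
Only the target environment varies; there is no separate emergency path."""
import argparse
import subprocess

ENVIRONMENTS = ["test", "staging", "production"]  # assumed environment names


def deploy(artifact: str, environment: str) -> None:
    """Deploy one already-built artifact to one environment."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment}")
    # One code path for every change: feature, fix, or emergency.
    subprocess.run(
        ["./scripts/rollout.sh", artifact, environment],  # hypothetical rollout command
        check=True,
    )


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Deploy one artifact to one environment")
    parser.add_argument("artifact", help="immutable, versioned artifact built exactly once")
    parser.add_argument("environment", choices=ENVIRONMENTS)
    args = parser.parse_args()
    deploy(args.artifact, args.environment)
```

If the emergency path ever needs to differ from this, that difference is the next thing to fix.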


Pipeline

Focus on hardening the pipeline. Its job is to block bad changes, and the team’s job is to continuously develop its ability to do that. There is only one process, and it must be the one you would use in an emergency: if a step will not be used to resolve a critical outage, it should not be happening in the CD pipeline.

Integrate outside the pipeline; virtualize inside the pipeline. Direct integration is not a good method for validating behavior because the data returned is not controlled, although it is a good way to validate service mocks. If it is done in the pipeline, it puts fixing production at risk whenever the dependency is unavailable.

There should be one or fewer stage gates. Until release and deploy are decoupled, a single approval for deploying to production may remain; there should be no other stage gates.

Developers are responsible for the full pipeline. No handoffs. Handoffs cause delays and dilute ownership. The team owns its pipelines and the applications they deploy from birth to death.

Short CI Cycle Time

CI cycle time should be less than 10 minutes from commit to artifact creation. CD cycle time should be less than 60 minutes from commit to Production.

Integrate outside the pipeline. Virtualize inside the pipeline

Direct integration with stateful dependencies (end-to-end testing) should be avoided in the pipeline. Tests in the pipeline should be deterministic, and the larger the number of integration points, the more difficult it is to manage state and maintain determinism. Direct integration is a good way to validate service mocks, but if it is done in the pipeline, it puts fixing production at risk whenever the dependency is unstable or unavailable.
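A minimal sketch of what this looks like in practice, assuming a hypothetical pricing dependency (the client interface and values are invented for illustration): the in-pipeline test runs against a stub with controlled data, while checking that the stub still matches the real service happens outside the pipeline.

```python
"""Sketch of "virtualize inside the pipeline": the pipeline test uses a stub
with controlled data so it stays deterministic; validating the stub against
the real service is a separate job run outside the pipeline."""
from dataclasses import dataclass


@dataclass
class Quote:
    sku: str
    price_cents: int


class StubPricingClient:
    """Virtual service standing in for a real pricing dependency."""

    def get_quote(self, sku: str) -> Quote:
        # Controlled, deterministic data instead of whatever the live service returns.
        return Quote(sku=sku, price_cents=1299)


def total_with_tax(client, sku: str, tax_rate: float) -> int:
    """System under test: price a SKU including tax."""
    quote = client.get_quote(sku)
    return round(quote.price_cents * (1 + tax_rate))


def test_total_with_tax_is_deterministic():
    # Runs in the pipeline: no network, no shared state, always the same answer.
    assert total_with_tax(StubPricingClient(), "ABC-123", 0.10) == 1429
```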

All test automation pre-commit

Tests should be co-located with the system under test and all acceptance testing should be done by the development team. The goal is not 100% coverage. The goal is efficient, fast, effective testing.

No manual steps

There should be no manual intervention after the code is integrated into the trunk. Manual steps inject defects.


Tips

Use trunk merge frequency, development cycle time, and delivery frequency to uncover pain points. The team has complete control of merge frequency and development cycle time, and can uncover most issues by working to improve those two metrics.
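As one hedged example of making these metrics cheap to collect, a short script can count merges to the trunk per day from git history; the branch name and time window below are assumptions, and teams that commit directly to the trunk would count commits rather than merge commits.

```python
"""Rough sketch: count merges to the trunk per day from git history to track
trunk merge frequency. Assumes the trunk branch is named "main" and that the
script is run from inside the repository."""
import subprocess
from collections import Counter


def merges_per_day(trunk: str = "main", since: str = "14 days ago") -> Counter:
    """Return a Counter mapping date -> number of merges to the trunk."""
    dates = subprocess.run(
        ["git", "log", trunk, "--merges", f"--since={since}",
         "--format=%ad", "--date=short"],
        check=True, capture_output=True, text=True,
    ).stdout.split()
    return Counter(dates)


if __name__ == "__main__":
    for day, count in sorted(merges_per_day().items()):
        print(f"{day}: {count} merge(s) to trunk")
```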

Make sure to keep all metrics visible and refer to them often to help drive the change.

See CD best practices and CD Roadblocks for more tips on effectively introducing CI/CD improvements to your team’s processes.



1 - Common Blockers

The following are very frequent issues that teams encounter when working to improve the flow of delivery.

Work Breakdown

Stories without testable acceptance criteria

All stories should be defined with declarative and testable acceptance criteria. This reduces the amount of waiting and rework once coding begins and enables a much smoother testing workflow.

Acceptance criteria should define “done” for the story. No behavior other than that specified by the acceptance criteria should be implemented. This ensures we are consistently delivering what was agreed to.
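For example, a declarative criterion such as “Given an inactive account, when the user attempts to log in, then access is denied” maps directly onto an automated check. A minimal sketch, where the account model and login behavior are hypothetical stand-ins for the real system:

```python
"""Sketch: a declarative acceptance criterion expressed as an executable test.
The Account model and login() function are hypothetical stand-ins."""
from dataclasses import dataclass


@dataclass
class Account:
    username: str
    active: bool


def login(account: Account) -> bool:
    # Hypothetical system under test: only active accounts may log in.
    return account.active


def test_inactive_account_is_denied_access():
    # Given an inactive account
    account = Account(username="pat", active=False)
    # When the user attempts to log in
    allowed = login(account)
    # Then access is denied
    assert allowed is False
```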

Stories too large

It’s common for teams using two-week sprints to have stories that require five to ten days to complete. Large stories hide complexity, uncertainty, and dependencies.

  • Stories represent the smallest user observable behavior change.
  • To enable rapid feedback, higher-quality acceptance criteria, and more predictable delivery, stories should require no more than two days for a team to deliver.

No definition of “ready”

Teams should have a working agreement about the definition of “ready” for a story or task. Until the team agrees it has the information it needs, no commitments should be made and the story should not be added to the “ready” backlog.

Definition of Ready

- Story
  - Acceptance criteria aligned with the value statement, agreed to, and understood.
  - Dependencies noted and a resolution process for each in place.
  - Spikes resolved.

- Sub-task
  - Contract changes documented.
  - Component acceptance tests defined.

No definition of “Done”

Having an explicit definition of done is important to keeping WIP low and finishing work.

Definition of Done

- Sub-task
  - Acceptance criteria met
  - Automated tests verified
  - Code reviewed
  - Merged to Trunk
  - Demoed to team
  - Deployed to production

- Story
  - PO Demo completed
  - Acceptance criteria met
  - All tasks "Done"
  - Deployed to production

Team Workflow

Assigning tasks for the sprint

Work should always be pulled by the next available team member. Assigning tasks results in each team member working in isolation on a task list instead of the team focusing on delivering the next high value item. It also means that people are less invested in the work other people are doing. New work should be started only after helping others complete work in progress.

Co-dependent releases

Multi-component release trains increase batch size and reduce delivered quality. Teams cannot improve efficiency if they are constantly waiting. Handle dependencies with code, do not manage them with process. If you need a person to coordinate releases, things are seriously broken.

Handoffs to other teams

If the normal flow of work requires waiting on another team then batch sizes increase and quality is reduced. Teams should be organized so they can deliver their work without coordinating outside the team.

Early story refining

As soon as we decide a story has been refined to where we can begin developing it, the information begins to age because we will never fully capture everything we decided on. The longer a story is “ready” before we begin working on it, the less context we retain from the conversation. Warehoused stories age like milk. Limit the inventory and spend more time on delivering current work.

Manual test as a stage gate

In this context, a test is a repeatable, deterministic activity to verify the releasability of the system. There are manual activities related to exploration of edge cases and how usable the application is for the intended consumer, but these are not tests.

There should be no manual validation as a step before we deploy a change. This includes, but is not limited to manual acceptance testing, change advisory boards (CAB), and manual security testing.

Meaningless retrospectives

Retrospectives should be metrics driven. Improvement items should be treated as business features.

Hardening / Testing / Tech Debt Sprints

Just no. These are not real things. Sprints represent work that can be delivered to production.

Moving “resources” on and off teams to meet “demand”

Teams take time to grow; they cannot be “constructed”. Adding or removing anyone from a team lowers the team’s maturity and average problem space expertise. Changing too many people on a team reboots the team.

One delivery per sprint

Sprints are planning increments, not delivery increments. Plan what will be delivered daily during the sprint.

Skipping demo

If the team has nothing to demo, demo that. Never skip demo.

Committing to distant dates

Uncertainty increases with time. Distant deliverables need detailed analysis.

Not committing to dates

Commitments drive delivery. Commit to the next Minimum Viable Feature.

Velocity as a measure of productivity

Velocity is a planning metric: “We can typically get this much done in this much time.” It’s an estimate of relative capacity for new work that tends to change over time, and these changes don’t necessarily indicate a shift in productivity. It’s also an arbitrary measure that varies wildly between organizations, teams, and products. There’s no credible means of translating it into a normalized figure that can be used for meaningful comparison.

Equating velocity with productivity creates an incentive to optimize velocity at the expense of developing quality software.


CD Anti-Patterns

Work Breakdown

| Issue | Description | Good Practice |
| --- | --- | --- |
| Unclear requirements | Stories without testable acceptance criteria | Work should be defined with acceptance tests to improve clarity and enable developer-driven testing. |
| Long development time | Stories take too long to deliver to the end user | Use BDD to decompose work into testable acceptance criteria to find smaller deliverables that can be completed in less than 2 days. |

Workflow Management

| Issue | Description | Good Practice |
| --- | --- | --- |
| Rubber band scope | Scope that keeps expanding over time | Use BDD to clearly define the scope of a story and never expand it after work begins. |
| Focusing on individual productivity | Attempting to manage a team by reporting the “productivity” of individual team members. This is the fastest way to destroy teamwork. | Measure team efficiency, effectiveness, and morale. |
| Estimation based on resource assignment | Pre-allocating backlog items to people based on skill and hoping that those people do not have life events. | The whole team should own the team’s work. Work should be pulled in priority sequence, and the team should work daily to remove knowledge silos. |
| Meaningless retrospectives | Having a retrospective where the outcome does not result in team improvement items. | Focus the retrospective on the main constraints to daily delivery of value. |
| Skipping demo | No work that can be demoed was completed. | Demo the fact that no work is ready to demo. |
| No definition of “Done” or “Ready” | Obvious | Make sure there are clear entry gates for “ready” and “done” and that the gates are applied without exception. |
| One or fewer deliveries per sprint | The sprint results in one or fewer changes that are production-ready. | Sprints are planning increments, not delivery increments. Plan what will be delivered daily during the sprint. Uncertainty increases with time; distant deliverables need detailed analysis. |
| Pre-assigned work | Assigning the list of tasks each person will do as part of sprint planning. This results in each team member working in isolation on a task list instead of the team focusing on delivering the next high-value item. | The whole team should own the team’s work. Work should be pulled in priority sequence, and the team should work daily to remove knowledge silos. |

Teams

| Issue | Description | Good Practice |
| --- | --- | --- |
| Unstable team tenure | People are frequently moved between teams | Teams take time to grow. Adding or removing anyone from a team lowers the team’s maturity and average expertise in the solution. Be mindful of change management. |
| Poor teamwork | Poor communication between team members due to time delays or “expert knowledge” silos | Make sure there is sufficient time overlap and that specific portions of the system are not assigned to individuals. |
| Multi-team deploys | Requiring more than one team to deliver synchronously reduces the ability to respond to production issues in a timely manner and delays delivery of any feature to the speed of the slowest team. | Make sure all dependencies between teams are handled in ways that allow teams to deploy independently in any sequence. |

Testing Process

| Issue | Description | Good Practice |
| --- | --- | --- |
| Outsourced testing | Some or all acceptance testing is performed by a different team or an assigned subset of the product team. | Building quality feedback into the process and continuously improving it is the responsibility of the development team. |
| Manual testing | Using manual testing for functional acceptance testing. | Manual tests should only be used for things that cannot be automated. In addition, manual tests should not block delivery; they should be asynchronous validations. |

2 - Pipeline & Application Architecture

Whenever teams or areas want to improve their ability to deliver, there is a recommended order of operations to ensure the improvement is effective. This value stream improvement journey aims to lay out those steps and guide you toward good implementation practices.

Prerequisite: Please review the CD Getting Started guide for context.

1. Build a Deployment Pipeline

Before any meaningful improvement can happen, the first constraint must be cleared: make sure there is a single, automated deployment pipeline to production. Human intervention after the code is integrated should be limited to approving stage gates that trigger automation where needed. A well-architected pipeline builds an artifact once, deploys that artifact to all required test environments for validation, and delivers changes safely to production. It also triggers all of the tests and provides rapid feedback as near the source of failure as possible. This is critical for informing the developer who created the defect so that they have the chance to learn why the defect was created and prevent future occurrences.
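The outline below is a hedged sketch of that shape, not a prescription for any particular CI/CD tool; the stage names and ordering are assumptions, but the point is that the artifact is built once, the fastest checks run first, and the same artifact moves through every environment.

```python
"""Hypothetical outline of a single deployment pipeline: build the artifact
once, run the fastest checks first so failures are reported close to the
source, then deploy that same artifact through each environment in turn."""

STAGES = [
    "static-analysis",    # seconds: lint and type checks
    "unit-tests",         # minutes: fast, deterministic
    "build-artifact",     # built exactly once; every later stage reuses it
    "deploy-test",        # same artifact, test environment
    "acceptance-tests",   # deterministic tests with virtualized dependencies
    "deploy-production",  # same artifact, production
]


def run_stage(stage: str) -> bool:
    # Placeholder: a real pipeline would invoke build, test, or deploy tooling here.
    print(f"running {stage}")
    return True


def run_pipeline() -> None:
    for stage in STAGES:
        if not run_stage(stage):
            # Fail fast: stop at the first failing stage so the developer who
            # introduced the change gets feedback near the source of failure.
            raise SystemExit(f"pipeline failed at: {stage}")


if __name__ == "__main__":
    run_pipeline()
```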

Entangled Architecture - Requires Remediation


With an entangled architecture, there is no clear ownership of individual components or their quality. Every team could cause a defect anywhere in the system because they are not working within product boundaries. The pipeline’s quality signal will be delayed compared to better-optimized team architectures. When a defect is found, it will require effort to identify which team created the defect and a multi-team effort to improve the development process to prevent regression. Continuous delivery is difficult with this architecture.

The journey to CD begins with each team executing continuous integration on a team branch, with those branches integrated automatically into a master CI flow daily.

(Figure: multi-team branching)

Any breaks in the pipeline should be addressed immediately by the team who owns the branch.

Common Entangled Practices

Team Structure: Feature teams focused on cross-cutting deliverables instead of product ownership and capability expertise.

Development Process: Long-lived feature branches integrated after features are complete.

Branching: Team branches, with each team working toward CI on its branch and daily integration of team branches to the trunk, which re-runs the team-level tests.

Inverted Test Pyramid: The “ice cream cone testing” anti-pattern is common. However, teams should focus on improving quality feedback by engineering tests that alert earlier in the build cycle.

Pipeline: Establishing reliable build/deploy automation is a high priority.

Deploy Cadence / Risk: Delivery cadence in this architecture tends to be extended, which in turn leads to a large code change delta and high risk.

Improvement Plan

Find the architectural boundaries in the application that can be used to divide sub-systems between the teams to create product teams. This step will realign the teams to a tightly coupled architecture with defined ownership, will improve quality outcomes, and will allow them to further decouple the system using the [Strangler Fig](https://martinfowler.com/bliki/StranglerFigApplication.html) process.


Tightly Coupled Architecture - Transitional

(Figure: coupled pipelines)

With tightly coupled architecture, changes in one portion of the application can cause unexpected changes in another portion of the application. It’s quite common for even simple changes to take days or weeks of analysis to verify the implications of the change.

Tightly coupled applications have sub-assemblies assigned to product teams along logical application boundaries. This enables each team to establish a quality signal for their components and have the feedback required for improving their quality process. This architecture requires a more complicated integration pipeline to make sure each of the components can be tested individually and as a larger application. Simplifying the pipelines and decoupling the application will result in higher quality with less overhead.

Common Tightly Coupled Practices

Team Structure: Product teams focused on further de-coupling sub-systems

Development Process: Continuous integration. Small, tested changes are applied to the trunk as soon as complete on each product team. In addition, a larger CI pipeline is required to frequently run larger tests on the integrated system, at least once per day.

Branching: Because CI requires frequent updates to the trunk, [Trunk-Based Development](https://trunkbaseddevelopment.com) is used for CI.

Developer Driven Testing: The team is responsible for architecting and continuously improving a suite of tests that give rapid feedback on quality issues. The team is also responsible for the outcomes of poor testing, such as L1 support. This is a critical feedback loop for quality improvement.

Pipeline: CI pipeline working to progress to continuous delivery.

Deploy Cadence / Risk: Deliveries can be more frequent. Risk is inversely proportional to delivery frequency.

Improvement Plan

  1. As more changes are needed, the team continues extracting independent [domain services](https://www.amazon.com/Implementing-Domain-Driven-Design-Vaughn-Vernon/dp/0321834577) with well-defined APIs.
  2. For infrequently changed portions of the application that are poorly tested, re-writing may result in lost business capabilities. Wrapping these components in an API without re-architecting may be a better solution.

Loosely Coupled Architecture - Goal

With a loosely coupled architecture, components are delivered independently of each other in any sequence. This reduces complexity and improves quality feedback loops. This not only relies on clean separations of teams and sub-assemblies but also on mature testing practices that include the use of virtual services to verify integration.

It’s critical when planning to decompose into smaller services that Domain-Driven Design is used to inform service boundaries, value objects, and team ownership. Services should use good micro-service design patterns.

Once we have built our production deployment pipeline, the next most critical constraint to address is the trustworthiness of our tests.

Common Loosely Coupled Practices

Team Structure: Product teams maintain independent components with well-defined APIs.

Development Process: Continuous integration. Small, tested changes are applied to the trunk as soon as complete on each product team.

Branching: Because CI requires frequent updates to the trunk, [Trunk-Based Development](https://trunkbaseddevelopment.com) is used for CI.

Developer Driven Testing: The team is responsible for architecting and continuously improving a suite of tests that give rapid feedback on quality issues. The team is also responsible for the outcomes of poor testing, such as L1 support. This is a critical feedback loop for quality improvement.

Pipeline: One or more CD pipelines that are independently deployable at any time in any sequence.

Deploy Cadence / Risk: Deliveries can occur on demand or immediately after being verified by the pipeline. Risk is inversely proportional to delivery frequency.

2. Stabilize the Quality Signal

Establishing a production pipeline allows us to evaluate and improve our quality signal. Quality gates should be designed to inform the team of poor quality as close to the source as possible. This goal will be disrupted by unstable tests.

Remediating Test Instability

Unstable test results create a lack of trust in the tests and encourage bypassing test failures. To correct this:

  • Remove flaky tests from the pipeline to ensure that tests in the pipeline are trusted by the team
  • Identify the causes for test instability and take corrective action
    • If the test can be stabilized and provides value, correct it and move it back into the pipeline
    • If it cannot be stabilized but is required, schedule it to run outside the pipeline
    • If not required, remove it

In general, bias should be towards testing enough, but not over-testing. Tracking the duration of the pipeline and enacting a quality gate for maximum pipeline duration (from PR merge to production) is a good way to keep testing efficient.
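One common way to implement the quarantine described above is a test marker that the pipeline excludes, shown here as a hedged pytest sketch (the marker name is an assumption and would need to be registered in the project’s pytest configuration): quarantined tests cannot pollute the quality signal and run on a separate, non-blocking schedule until they are stabilized or removed.

```python
"""Sketch of quarantining an unstable test with pytest. The pipeline runs
`pytest -m "not quarantined"`, so untrusted tests cannot block a release or
erode trust in the signal; quarantined tests run in a separate, non-blocking job."""
import pytest


@pytest.mark.quarantined  # hypothetical marker, registered in pytest.ini
def test_search_against_live_dependency():
    # Unstable: depends on live data we do not control. Stabilize, move out, or remove.
    ...


def test_search_ranking_is_deterministic():
    # Trusted, deterministic test that stays in the pipeline.
    assert sorted(["b", "a", "c"]) == ["a", "b", "c"]
```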

After stabilizing the quality signal, we can track where most of the defects are detected and the type of defects they are. Start tracking the trends for the number of defects found in each environment and the root cause distribution of the defects to inform the test suite improvement plan. Then focus the improvements on moving the majority of defect detection closer to the developer. The ultimate goal is for most defects to be trapped in the developer’s environment and not leak into the deployment pipeline.
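A small tally like the hedged sketch below (the record format and categories are invented for illustration) is often enough to show whether defect detection is moving closer to the developer over time:

```python
"""Rough sketch: tally where defects are detected and their root causes to
inform the test suite improvement plan. The records below are hypothetical."""
from collections import Counter

# Hypothetical defect records: (environment where detected, root-cause category)
DEFECTS = [
    ("developer", "logic"),
    ("pipeline", "contract"),
    ("production", "configuration"),
    ("pipeline", "logic"),
]


def detection_by_environment(defects):
    return Counter(env for env, _ in defects)


def root_cause_distribution(defects):
    return Counter(cause for _, cause in defects)


if __name__ == "__main__":
    print("detected in:", dict(detection_by_environment(DEFECTS)))
    print("root causes:", dict(root_cause_distribution(DEFECTS)))
```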

3. Continuous Improvement

After removing noise from the quality signal, we need to find and remove more waste on a continuous basis. We start by mapping the deployment process from coding to production delivery and identifying the choke points that constrain the entire flow. The process for doing this and its effectiveness are documented in Goldratt’s “Theory of Constraints” (TOC). The TOC states that the entire system is limited by a single constraint and that improving the system is only effective when that constraint is addressed.

  1. Identify the system constraint.
  2. Decide how to exploit the system constraint.
  3. Subordinate everything else to the above decisions.
  4. Elevate the constraint.
  5. If, in the previous steps, the constraint has been broken, go back to step one, but do not allow inertia to cause a new system constraint.

Some common constraints are:

  • Resource Constraints - limited capacity, such as the number of people who can perform a task or access to environments, that blocks the flow of desired outcomes.
  • Policy Constraints - policies, practices, or metrics that artificially impede flow because they are poorly aligned with the overall performance of the system.

Working daily to relentlessly remove constraints is the most important work a team can do. Doing so means they are constantly testing their improved delivery system by delivering value and constantly improving their ability to do so. Quality, predictability, stability, and speed all improve.

References

| Title | Author |
| --- | --- |
| Accelerate | Forsgren, Humble, & Kim, 2018 |
| Engineering the Digital Transformation | Gruver, 2019 |
| A Practical Approach to Large-Scale Agile Development: How HP Transformed LaserJet FutureSmart Firmware | Gruver et al., 2012 |
| Theory of Constraints | Goldratt |