Secrets to Mastering High-Frequency Production Deployments

December 16, 2024

by Shrikant Vashishtha

Deployment Frequency

Continuous Deployment

Feature Toggle

Customer Centric Approach

DORA Metrics

Frequent deployments, as emphasized by DORA metrics, reflect a team’s ability to deliver new features or fixes quickly and reliably. This approach prioritizes continuous delivery, where small, manageable changes are moved to production often, reducing risks and enabling faster feedback.

By slicing a feature into vertical slices, teams can adapt quickly to customer needs and reduce risk by improving quality. In this article, we’ll discuss strategies for increasing deployment frequency, embracing automation, and building a culture of incremental and iterative progress so that every release brings value to users.

Why Shortening the Development Cycle Accelerates Learning

The development lifecycle of any software product begins with the intention to drive a change in user behavior, which enables users to solve their problems more effectively.

While this intention is noble, whether it actually works can only be determined when users interact with the feature—this is when a team learns the most. In software product development, some learning occurs during the discovery phase, and very little happens during the development and deployment phases. The majority of learning happens when the product increment reaches the users.

Sometimes, even with the best of intentions, nobody uses the feature, or only a few users do. In other cases, people find it hard to apply to their use cases and look for alternatives.

That’s why it’s important to remain humble about your expectations for what you are building, fail early, and learn quickly. What if it takes a long time for anything to reach production? In Lean terms, that unreleased work amounts to inventory, which is waste and carries an associated cost of delay.

To reduce this cost of delay, it's important to shorten the development cycle, apply Continuous Deployment, and spend the time saved on measuring the impact of each change and learning from it.

When we talk about shortening the development cycle, we mean that any item entering the development phase should be ready for production within a couple of days. If a backlog item is expected to take longer, it needs to be broken down further.

Many organizations are already on this journey, including traditionally conservative ones such as banks. For instance, the IT team at one of the world’s 10 largest banks has a mandate to deliver at least one production deployment per person per week across the board. That translates into 10 deployments a week for a 10-person team, and 520 deployments a year.
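The arithmetic behind that mandate can be checked with a few lines of Python. This is only an illustrative sketch: the function name and the 52-week year are assumptions, not something the bank is known to use.

```python
WEEKS_PER_YEAR = 52  # assumption: the mandate applies every week of the year


def deployments_per_year(team_size: int, per_person_per_week: int = 1) -> int:
    """Yearly deployments implied by a per-person weekly mandate."""
    return team_size * per_person_per_week * WEEKS_PER_YEAR


weekly = 10 * 1                               # 10 people, one deployment each
yearly = deployments_per_year(team_size=10)   # 10 * 1 * 52
print(weekly, yearly)  # 10 520
```

The same function makes it easy to see how the total scales with team size or with a stricter weekly target.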

Difference Between Viable and Valuable Feature

One of the most significant impediments teams face when striving to move to production quickly is their interpretation of what exactly should move to production.

Typically, small backlog items can be deployed and released to users quickly. However, when a backlog item is large and may take a few weeks to complete, some teams hesitate to move to production until the entire feature is finished. This happens even when they have the option to break the feature into smaller, independent slices and develop and deploy them incrementally.

In such a context, it’s important to recognize that a small vertical slice of a feature can be valuable to deploy but may not yet be viable and ready for public release.

For example, while a login feature on an e-commerce website is valuable in itself, releasing the site with only a login feature would not make sense or be viable from a business perspective. To create a viable business case, the website would need additional features, such as selecting products, adding them to a shopping cart, and processing payments.

In a similar fashion, vertical slices of a feature can be valuable in themselves but not yet viable. In that case, a team should be able to deploy them to production without releasing them to the public. Teams achieve this by using a feature toggle.

Many well-known products do this. In any current installation of WhatsApp, for instance, there may be many features that are deployed to production but not yet released.
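As a minimal sketch of how a feature toggle separates deployment from release, consider the Python example below. Everything in it is hypothetical: the `FeatureToggles` class, the `wallet_payment` toggle name, and the checkout example are illustrations, and real systems usually load toggle state from configuration or a toggle service rather than from an in-memory set.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureToggles:
    """Hypothetical in-memory toggle store."""
    enabled: set[str] = field(default_factory=set)

    def is_enabled(self, name: str) -> bool:
        return name in self.enabled


def checkout_options(toggles: FeatureToggles) -> list[str]:
    """Return the payment options visible to the user."""
    options = ["card"]
    # The wallet slice ships with this code but stays hidden until released.
    if toggles.is_enabled("wallet_payment"):
        options.append("wallet")
    return options


deployed_not_released = FeatureToggles()
released = FeatureToggles(enabled={"wallet_payment"})
print(checkout_options(deployed_not_released))  # ['card']
print(checkout_options(released))               # ['card', 'wallet']
```

The code path for the new slice is in production in both cases; only the toggle state decides whether users see it, and switching it off again acts as a fast rollback.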

A question might arise: if a user derives value only when a feature is fully viable, why break it into small vertical slices instead of delivering it as a single, consolidated deployment at the end? The answer lies in the foundation of Continuous Integration.

If it’s hard to do something, you are not doing it often enough!
If the delta is small, so is the risk.

In a large consolidated deployment, risk may emerge from many different directions. With small vertical slices, the delta of change becomes small, and as the delta shrinks, so does the risk. That is the reason for doing small, frequent production deployments through vertical slices.

Teams should move away from the mindset of moving to production only when everything is complete. Instead, they should slice a feature into small vertical slices and move them to production every couple of days.

Essential Elements of High Frequency Deployment

Let's look at what it takes to move towards high-frequency deployment.

The Mindset - It's all about customer need/problem, not about the feature you are developing

The ultimate goal of any work undertaken by a software development team is to enhance the customer experience and effectively address user challenges.

Despite the best of intentions, a hypothesis may sometimes fail. It’s crucial for a team to adopt an attitude of letting go and moving forward if something doesn’t work, rather than taking it personally. Sometimes the team should be open to deleting a feature as well.

At the end of the day,
Attaining a high release frequency is not the goal. Releasing frequently in small increments is a way to deliver good products.

Breaking Feature into Vertical Slices

As evident in the discussion around viable versus valuable, it’s crucial to deliver small, valuable slices to production rather than waiting to deploy an entire feature, as this helps in reducing risk.

The slices should be vertical (a complete, functional piece of the system that spans all layers of the technology stack, delivering end-to-end functionality) and small enough to move to production in a couple of days. The slicing exercise can continue to the nth level until the slices reach that size.

While slicing, it's important to keep in mind the intent of the feature, i.e. the problem it aims to solve.

In the picture above, for instance, the goal is to commute from point A to point B quickly. Each iterative implementation aims to reach that goal and help the customer get from point A to point B quickly. The purpose of these iterative increments is to serve the customer quickly, gather feedback throughout the process, and keep iterating to provide the best possible user experience.

Continuous Focus on Risk Reduction

One of the key reasons people hesitate to release more often is the risk involved in deploying any release. Even for a small release, risks come from many different directions. In an enterprise, teams acknowledge and mitigate these risks in many different ways. Here are some example risks and the mitigations teams focus on as part of their release process.

Functional correctness of a feature is taken care of through clearly defined acceptance criteria followed by automated tests, static code analysis and exploratory testing. For any deployment, automated tests are essential to contain the continuously expanding regression effort.
Post Deployment Verification ensures that the change meets the requirements without impacting existing functionality.
Automated deployments ensure deployments are reliable, repeatable and resilient. If a deployment fails, it can be rolled back or toggled off quickly, avoiding any service disruption. It's important to define the rollback process for each change, and that process should be tested.
Automated security scans as part of the CI build help identify and remediate security vulnerabilities, protecting production from cybersecurity risks.
Extensive monitoring and observability are applied through pre-deployment verification and post-deployment monitoring to track health metrics such as error rates, latency, and system utilization.
Teams build for reliability and resilience through immutable infrastructure, in which a service is deployed as a new instance instead of modifying live instances.
Incremental deployment strategies are applied through blue-green deployments (maintain two environments, blue as live and green as the new deployment, and switch traffic to green only when the deployment is validated) and through canary deployments (gradually release updates to a small subset of users, monitor performance, and scale up).
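The canary idea from the list above, combined with monitoring and a quick rollback, can be sketched as follows. This is a simplified illustration under stated assumptions: `error_rate_at` is a hypothetical probe standing in for a real monitoring query, and the traffic steps and the 1% error threshold are made-up values, not recommendations.

```python
from typing import Callable, Sequence


def canary_rollout(
    error_rate_at: Callable[[int], float],
    steps: Sequence[int] = (5, 25, 50, 100),
    max_error_rate: float = 0.01,
) -> tuple[str, int]:
    """Ramp traffic to the new version step by step.

    error_rate_at(percent) is a hypothetical probe returning the error rate
    observed while `percent` of traffic is served by the new version.
    Returns ("released", 100) or ("rolled_back", percent_where_it_failed).
    """
    for percent in steps:
        observed = error_rate_at(percent)
        if observed > max_error_rate:
            # Small blast radius: only `percent` of traffic saw the bad version.
            return ("rolled_back", percent)
    return ("released", steps[-1])


def healthy(percent: int) -> float:
    return 0.002


def degrades_under_load(percent: int) -> float:
    return 0.002 if percent < 50 else 0.08


print(canary_rollout(healthy))              # ('released', 100)
print(canary_rollout(degrades_under_load))  # ('rolled_back', 50)
```

In practice each step would also wait for a soak period before reading metrics; that is left out here to keep the sketch short.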

Customer centricity

Drawing from experience with many teams, reacting quickly builds far more trust with customers than going a long time without delivering.

When teams respond quickly and solve a customer's problems, the customer in turn is keen to lend a helping hand.

It's important to keep the customer at the center of any discussion a team has. Teams should hold regular feedback sessions with the customer when a feature is delivered, and again after delivery to see how the feature is faring.

Establishing a post-delivery feedback loop also helps in understanding why something that was delivered is not being used and in identifying areas where further support might be needed.

Take Ops Help to Improve Your Release

Ops can provide self-diagnosing tools that improve your release.

In an environment where briefly breaking production is acceptable because rollback is quick, a 5-10 minute break in production followed by a quick rollback should not be a serious issue.
Some Ops tools can help make a release painless in a microservices context:
- Each service endpoint should be able to tell whether it's healthy or not.
- Gather this information in a single place for multiple services to see whether the instances are healthy. We should be able to identify which dependency is causing the problem.
- Someone should monitor the services, or the team should have an alerting mechanism for when something goes wrong.
- Run health checks as part of the production deployment, which makes a team more confident that any breakage will be small.
- Before releasing, verify that all production health checks pass. Also ensure that the versions currently running are ones we can roll back to.
- The deployment should have a version to which it can roll back.
Run automated sanity checks (defined by the business) for the most important and critical paths of the application.
Look for more automated checks for risky applications.
It's helpful to use rolling upgrades. A rolling upgrade is an upgrade of a software version performed without noticeable downtime or other disruption of service.
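The health-check points above (per-service health, a single aggregated view, finding the failing dependency, and a pre-release gate with a known rollback version) could look roughly like this in Python. All names, services, and version numbers are hypothetical; a real setup would populate the reports from each service's health endpoint.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceHealth:
    """One service's self-reported health (hypothetical report shape)."""
    name: str
    healthy: bool
    version: str
    depends_on: list[str] = field(default_factory=list)


def unhealthy_services(reports: list[ServiceHealth]) -> list[str]:
    return sorted(r.name for r in reports if not r.healthy)


def root_causes(reports: list[ServiceHealth]) -> list[str]:
    """Unhealthy services whose own dependencies are all healthy."""
    bad = set(unhealthy_services(reports))
    return sorted(
        r.name for r in reports
        if r.name in bad and not any(dep in bad for dep in r.depends_on)
    )


def safe_to_deploy(reports: list[ServiceHealth],
                   rollback_versions: dict[str, str],
                   service: str) -> bool:
    """Gate: everything is healthy and a rollback version is recorded."""
    return not unhealthy_services(reports) and service in rollback_versions


reports = [
    ServiceHealth("checkout", False, "2.4.0", ["payments"]),
    ServiceHealth("payments", False, "1.9.1", ["ledger-db"]),
    ServiceHealth("ledger-db", False, "5.0", []),
    ServiceHealth("catalog", True, "3.1.0"),
]
print(unhealthy_services(reports))  # ['checkout', 'ledger-db', 'payments']
print(root_causes(reports))         # ['ledger-db']
print(safe_to_deploy(reports, {"checkout": "2.3.9"}, "checkout"))  # False
```

Aggregating the reports in one place is what lets the team see that three red services trace back to a single failing dependency, instead of investigating each one separately.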

Conclusion

In conclusion, the journey towards high-frequency deployments and incremental feature delivery is not merely about releasing software faster but about delivering meaningful value to users while minimizing risk.

By breaking features into small, vertical slices and maintaining a continuous focus on customer needs, teams can iteratively refine their solutions based on user feedback.

Risk is mitigated through automated testing, robust deployment strategies, and resilient monitoring systems. Ultimately, the practice of frequent, small releases empowers teams to fail early, learn quickly, and foster trust with their users. Success lies in embracing humility, customer-centricity, and relentless risk reduction in the development process.
