The nightmare of any early-stage founder: you’re building traction and suddenly your product breaks. Fixing it takes time and money that you don’t have. What you do have is a lot of angry customers. This can be the end of your company. Larger companies can also face costly setbacks due to problems in their software. Malfunctions in parts of a solution that worked before are called regressions. Regressions are harder to catch than bugs in new features, but typically more expensive to fix.
Regression bugs can be a pain. At Squads, we help customers mitigate this risk by automating regression testing. We automate because, as the feature set grows, the regression test suite grows with it. Manually running a regression test suite and maintaining it separately from the code base is expensive and error-prone.
In this article, I’ll outline our roadmap towards full test automation. A road always takes us from A to B. So let’s define a starting point and an ultimate goal, and then zoom in on the steps in between.
A single developer writes code. Testing is done by running the code against production data. When it works, it’s live. This is how most webshops start. This is how Facebook started. This is how my scripts for personal use are maintained. Any tech-savvy computer user knows and uses this strategy for their first automation efforts. Probably most software currently in existence is maintained and run this way. There’s nothing wrong with this strategy, as long as it is not used where it doesn’t apply. If we want to improve on the rudimentary, it helps to look at the ideal situation first.
A team of developers writes code. Their communication is effective and efficient. This means interruptions are minimal, while the response to changes is quick. In this set-up the following rules should govern.
Rule 1: only interrupt a developer with a failing test. In other words, to acquire the right to interrupt any developer, you must pay with proof that something is not working. You can extend this rule to other creative work. For example, I might be interrupted by a freelancer while editing this article because their weekly bill has not been paid. I’ve come up with the rule that this is only allowed if the freelancer can show me a transaction older than 10 days that hasn’t been marked as paid (the test). I don’t enforce this rule strictly, because I welcome the chance to be nice to any community member, but frequent interruptions result in lower productivity, so the rule helps both me and Squads. The more difficult the creative work, the more costly interruptions are, so they should be kept to a minimum. For developers, the rule becomes: show me a failing test, or you will be shut down with ‘cannot reproduce’ in seconds. On its own that seems unfair, so we need a second rule that helps future bug reporters on their way.
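To make this concrete, here is a sketch of what that “proof” might look like: a minimal failing test attached to a bug report. The billing module, function, and discount rule are made-up examples.

```python
# A minimal reproduction attached to a bug report: it fails today, which earns
# the reporter the right to interrupt the developer working on this area.
# `calculate_invoice_total` and the discount rule are hypothetical examples.
from decimal import Decimal

from billing import calculate_invoice_total  # hypothetical module


def test_discount_is_applied_to_invoice_total():
    # Reported bug: a 10% discount on a 100.00 invoice still charges 100.00.
    total = calculate_invoice_total(amount=Decimal("100.00"), discount_percent=10)
    assert total == Decimal("90.00")  # fails today, proving the bug is real
```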
Rule 2: define a test for each feature before it is built. This is what makes rule 1 fair. If this didn’t happen, there would be no way to interrupt the developer; the organization would grind to a halt, and there would be no point in paying developers at all anymore. This happens more often than you might think, so define a test before you start paying for development (hiring a QA specialist for this can be a great idea), and extend it before you interrupt a dev working on that feature. It’s fine if this is initially a manual test, of course, but to reach perfection we have to improve and automate it.
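Such a test can exist before the feature does. A sketch, assuming a hypothetical CSV export feature, using pytest’s xfail marker so the test is expected to fail until the feature ships:

```python
# A test written before the feature exists: it documents the agreed behaviour
# up front and is expected to fail until the feature ships.
# The CSV export feature and `export_invoices_csv` helper are hypothetical.
import pytest


@pytest.mark.xfail(reason="CSV export is not built yet", strict=True)
def test_invoice_export_includes_header_row():
    # Imported inside the test so a missing module fails only this test,
    # instead of breaking collection of the whole file.
    from reporting import export_invoices_csv  # hypothetical module

    csv_text = export_invoices_csv(invoices=[])
    assert csv_text.splitlines()[0] == "invoice_id,customer,amount,paid"
```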
Rule 3: automate your tests. Smart humans are lazy, and they make mistakes in repetitive work. Computers are less likely to make mistakes in predictable work, and they are very good at repeating tasks. Once you have a well-defined test, running it again and again should be cheap. I’ve almost never seen automation fail to pay off. It’s usually cheaper than we think, and once in place, it mitigates risks that are usually much more costly than we think.
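As an illustration, a step from a manual regression checklist such as “log in and confirm the dashboard loads” can often be scripted along these lines. The URL, credentials, and page text below are placeholders.

```python
# A manual checklist step ("log in, confirm the dashboard loads") turned into
# a repeatable scripted check. The URL, credentials, and page text are
# hypothetical placeholders for illustration.
import requests


def test_login_and_dashboard_load():
    session = requests.Session()
    response = session.post(
        "https://staging.example.com/login",
        data={"email": "qa@example.com", "password": "not-a-real-password"},
        timeout=10,
    )
    assert response.ok

    dashboard = session.get("https://staging.example.com/dashboard", timeout=10)
    assert dashboard.ok
    assert "Welcome back" in dashboard.text
```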
Without well-defined tests (rule 2), test automation (rule 3) is meaningless. Once we have well-defined automated tests, we can safely go even further and aggressively automate deployment too.
We automate the tests and the deployments they run against; it’s the only way to implement rule 3 correctly. Once deployment is automated, it’s easy to apply the same automation to the production environment as well. This means that if a feature is complete enough to turn all the existing and new tests green, it can be automatically merged and deployed to production.
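A minimal sketch of that “deploy only when green” gate, assuming a small script your CI system calls after building a branch. The deploy command is a placeholder for whatever your platform uses.

```python
# Minimal "deploy only when the tests are green" gate, meant to be invoked by
# a CI job after the branch has been built. The deploy command is a
# hypothetical placeholder.
import subprocess
import sys


def main() -> int:
    # Run the whole test suite; a non-zero exit code means at least one red test.
    tests = subprocess.run(["pytest", "--maxfail=1"], check=False)
    if tests.returncode != 0:
        print("Tests are red: refusing to deploy.")
        return tests.returncode

    # Everything is green, so deployment is allowed to proceed.
    deploy = subprocess.run(["./scripts/deploy.sh", "production"], check=False)
    return deploy.returncode


if __name__ == "__main__":
    sys.exit(main())
```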
So in a perfect setup: everything is automated, there’s a test for everything, and we’re only interrupted when something is proven to be broken.
I have never seen perfection. I think that it may not exist. But that is no good reason not to strive for continuous improvement. A utopian goal can be a good one, and I think in this case it is.
Advice: take a close look at how things are and focus on painful places where the above is not true. It’s a simple but very effective recipe.
Let’s look at the transitions from where we are, to where we want to be.
Automating the wrong thing is a disaster waiting to happen. For example, if we automate deployment and commit directly to master, but don’t introduce mandatory PR reviews and regression testing, we’d have production outages way too often. So let’s first make things right, and then automate.
The first thing we introduce is a safety net of tests. Then we start executing those tests manually. I like to invite product owners to put demo and test scripts in their issues before planning meetings. The demo script is how they’d like to see the issue demoed during review; the test script is how they’d like the edge cases to work.
A QA specialist, acting as the product owner of the automated regression test suite, can expand edge cases from the acceptance criteria and then go back to the PO for approval before the development work starts. In this role, they are not just testing but also overseeing the quality and comprehensiveness of the tests as a whole. This emphasizes that tests are part of the design upfront. I like BDD. I think that a good designer is also a tester, and a good tester is also a designer. They are both focused on perfecting the user journey, and we should only develop things that are thought through from both a holistic and detailed perspective.
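To give a feel for how acceptance criteria expand into edge cases once automated, here is a sketch using a parametrized test. The top-up limits and the validate_top_up function are made-up examples.

```python
# Edge cases expanded from an acceptance criterion such as
# "users can top up their balance between 5 and 500 euros".
# `validate_top_up` and the limits are hypothetical examples.
import pytest

from payments import validate_top_up  # hypothetical module


@pytest.mark.parametrize(
    ("amount", "expected_valid"),
    [
        (5, True),       # lower bound is allowed
        (500, True),     # upper bound is allowed
        (4.99, False),   # just below the minimum
        (500.01, False), # just above the maximum
        (0, False),      # zero is rejected
        (-10, False),    # negative amounts are rejected
    ],
)
def test_top_up_amount_edge_cases(amount, expected_valid):
    assert validate_top_up(amount) is expected_valid
```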
Once there is enough functionality, the manual work of testing for regressions will become more and more costly. The impact of regressions will also increase with the user base. Once the cost and risk are high enough to outweigh the cost of automation, start automating regression tests with the highest risk first. Keeping a good balance between risk and mitigation is the essence of keeping your company safe. With appropriate guardrails we can increase speed.
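One lightweight way to decide what to automate first is to score each manual check by business impact and by how often that area has actually broken, then work down the list. A sketch, with made-up checks and scores:

```python
# A rough way to pick which regression checks to automate first:
# score each manual check by business impact and by how often that area has
# regressed, then automate from the top of the list. Names and scores are
# made-up examples.
from dataclasses import dataclass


@dataclass
class ManualCheck:
    name: str
    impact: int     # 1 (minor annoyance) .. 5 (blocks revenue)
    breakages: int  # how often this area regressed in the last year

    @property
    def risk(self) -> int:
        return self.impact * self.breakages


checks = [
    ManualCheck("checkout flow", impact=5, breakages=4),
    ManualCheck("password reset", impact=3, breakages=1),
    ManualCheck("PDF invoice layout", impact=2, breakages=6),
]

for check in sorted(checks, key=lambda c: c.risk, reverse=True):
    print(f"{check.risk:>3}  {check.name}")
```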
Once you’re safe and covered, you will want to speed up your organization’s response to change. In any market, moving faster means more success. This can be achieved in the following two ways:
Reducing the time spent on a single change,
Allowing changes to be developed in parallel.
To reduce the time spent on a change, we can automate even more. Automatic deployments allow a developer to share a URL to a working version of the software, instead of demoing on their local machine or sharing a video. This reduces interruptions and helps you move towards full deployment automation.
To allow changes in parallel, we need to define smaller and more decoupled changes. We also need to decouple inside the code, which increases maintainability and makes developers’ lives easier. To keep the automation working well, we need to introduce automatic feature deployments. Inside Squads, at the time of writing, each feature branch is automatically built and deployed, and automated tests are run against that deployment. Once green, I (the product owner) check the functionality, knowing that I won’t have to do double work if I find a regression. This makes my life easier, and my relationship with the development team stays great.
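Here is a sketch of how a test suite might be pointed at whichever feature-branch deployment the pipeline just created, assuming the deployment URL is passed in through an environment variable. The variable name and endpoints are placeholders.

```python
# Point the same automated checks at whichever feature-branch deployment the
# CI job just created. The environment variable name and endpoints are
# hypothetical placeholders.
import os

import requests

# e.g. https://feature-1234.review.example.com, injected by the CI pipeline
BASE_URL = os.environ.get("REVIEW_APP_URL", "http://localhost:8000")


def test_deployed_branch_is_healthy():
    response = requests.get(f"{BASE_URL}/health", timeout=10)
    assert response.status_code == 200


def test_homepage_renders():
    response = requests.get(BASE_URL, timeout=10)
    assert response.ok
    assert "<title>" in response.text
```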
At this stage, things start to diversify per situation. In high-volume B2C cases it might be cheaper to let end users do the majority of regression testing; the marketing department can then decide whether something works based on the conversion rate. In low-volume B2B cases, it might be cheaper to record automatic tests from all end users and apply them to all new features. These are two extremes on the risk vs. cost balance. Make sure that everything stays automated, and that tests like “no feature used by any user in the past 3 months breaks” or “our conversion rate doesn’t drop by more than x%” are also applied automatically.
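As a sketch of the conversion guard: a check that fails when the post-release conversion rate drops more than a chosen threshold below the pre-release baseline. The numbers and the threshold are placeholders; real ones would come from your analytics backend.

```python
# A guard of the "conversion doesn't drop more than x%" kind: compare the
# conversion rate after a release with the baseline before it, and fail the
# check if the relative drop exceeds a chosen threshold.
# The numbers and the data-fetching step are hypothetical placeholders.

MAX_RELATIVE_DROP = 0.05  # fail if conversion drops more than 5% relative to baseline


def conversion_rate(visitors: int, purchases: int) -> float:
    return purchases / visitors if visitors else 0.0


def test_conversion_rate_has_not_dropped():
    # In a real setup these numbers would come from your analytics backend.
    baseline = conversion_rate(visitors=10_000, purchases=320)  # before release
    current = conversion_rate(visitors=9_800, purchases=308)    # after release

    relative_drop = (baseline - current) / baseline
    assert relative_drop <= MAX_RELATIVE_DROP
```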
Your tests should be designed according to your business goals. Ultimately, it’s your stakeholders (investors, shareholders, key customers) who define the tests that matter; the people on your team are merely automating those tests and making them pass. Regression test automation has proven to reduce the risk of regression bugs and to cut release times by an order of magnitude.
Let me know if you have questions about implementing regression test automation in your company. We can help you.