What to Test (and What Not) for High Test Automation ROI

Mechanism
March 15, 2026 | by Shrikant Vashishtha
Test Automation
Software Quality
Continuous Delivery

A team I worked with recently had a familiar problem.

They owned a BFF gateway (Backend-for-Frontend) between a mobile app and a bunch of microservices. Validations, orchestration, error mapping, retries – the “simple glue” that quietly becomes your production heartbeat.

And yet…

The codebase had zero automated tests.

So they did what most sincere teams do:

“Let’s start unit tests. One test-class per class. Mock dependencies. Target 80% coverage.”

It sounds responsible. It sounds mature.

But I pushed back.

Not because unit tests are bad.
Because this pattern often produces a test suite that creates more problems than it solves.

The trap: tests that punish refactoring

Take a simple chain:

TransactionService → ValidationService → FolioRepository

The typical “unit test every class” playbook says:

  • Test TransactionService by mocking ValidationService

  • Test ValidationService by mocking FolioRepository

Now look at what those tests usually assert:

  • which method got called

  • with which arguments

  • in what sequence

  • and sometimes even how many times

That’s not testing behavior.

That’s testing wiring.
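To make the difference concrete, here is a minimal sketch in plain Java. It uses a hand-rolled recording fake instead of a mocking library so it runs on the JDK alone. Only the class names come from the chain above; the method shapes (`isEligible`, `validate`) and the `WiringVersusBehavior` holder are assumptions for illustration, not the team's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shapes for the chain above; only the class names come from the article.
interface FolioRepository {
    boolean isEligible(String folioId);
}

class ValidationService {
    private final FolioRepository repository;

    ValidationService(FolioRepository repository) {
        this.repository = repository;
    }

    boolean validate(String folioId, int amount) {
        return amount > 0 && repository.isEligible(folioId);
    }
}

// Hand-rolled stand-in for a mock: records every call so a test can inspect it.
class RecordingFolioRepository implements FolioRepository {
    final List<String> calls = new ArrayList<>();

    @Override
    public boolean isEligible(String folioId) {
        calls.add("isEligible(" + folioId + ")");
        return true;
    }
}

class WiringVersusBehavior {
    // Wiring-style check: passes only if the repository was called exactly once with this argument.
    static boolean wiringCheck() {
        RecordingFolioRepository repo = new RecordingFolioRepository();
        new ValidationService(repo).validate("F-1", 100);
        return repo.calls.equals(List.of("isEligible(F-1)"));
    }

    // Behavior-style check: only the observable result matters.
    static boolean behaviorCheck() {
        FolioRepository alwaysEligible = folioId -> true;
        return new ValidationService(alwaysEligible).validate("F-1", 100);
    }

    public static void main(String[] args) {
        System.out.println("wiring check passes:   " + wiringCheck());
        System.out.println("behavior check passes: " + behaviorCheck());
    }
}
```

Both checks pass against this version. The first one, though, is pinned to the exact call the service happens to make today; the second only cares about the answer.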

Now refactor:

  • split ValidationService into smaller validators

  • introduce a cache

  • change internal method signatures

The user doesn’t care. The API behavior hasn’t changed.

But your tests break.

So you spend hours rewriting tests… not because the product broke, but because the internals moved.

That’s the moment test automation stops being a safety net and starts becoming “work you must do before you can do real work.”

A simple question to spot low-ROI tests

Whenever I review a test suite, I ask this:

If I refactor the internals without changing behavior, should the tests fail?

They shouldn’t. If they break anyway, the tests weren’t protecting behavior.
They were protecting the current implementation.

And implementation changes constantly.
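One way to ask that question in code is to run the same behavior check against two implementations of one boundary. The sketch below assumes a hypothetical `TransactionApi` interface and two made-up versions of it (one monolithic, one split into validators with a cache); none of these names come from a real codebase.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical boundary: everything a caller can observe.
interface TransactionApi {
    String submit(String folioId, int amount);
}

// Version 1: one method does all the validation.
class OriginalTransactionApi implements TransactionApi {
    @Override
    public String submit(String folioId, int amount) {
        boolean valid = !folioId.isBlank() && amount > 0;
        return valid ? "SUCCESS" : "REJECTED";
    }
}

// Version 2: same behavior, refactored into small validators plus a result cache.
class RefactoredTransactionApi implements TransactionApi {
    private final List<BiPredicate<String, Integer>> validators = List.of(
        (folioId, amount) -> !folioId.isBlank(),
        (folioId, amount) -> amount > 0);
    private final Map<String, String> cache = new HashMap<>();

    @Override
    public String submit(String folioId, int amount) {
        return cache.computeIfAbsent(folioId + "|" + amount, key ->
            validators.stream().allMatch(v -> v.test(folioId, amount)) ? "SUCCESS" : "REJECTED");
    }
}

class RefactoringQuestion {
    // The check talks only to the boundary; it never looks inside either version.
    static boolean behavesAsPromised(TransactionApi api) {
        return api.submit("F-1", 100).equals("SUCCESS")
            && api.submit("F-1", 0).equals("REJECTED")
            && api.submit("", 100).equals("REJECTED");
    }

    public static void main(String[] args) {
        System.out.println(behavesAsPromised(new OriginalTransactionApi()));
        System.out.println(behavesAsPromised(new RefactoredTransactionApi()));
    }
}
```

The check passes for both versions without a single edit, which is the property you want from a test that is supposed to survive refactoring.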

Don’t throw away unit tests. Change what “unit” means.

This is where most debates become unproductive.

Some people hear this critique and conclude: “Unit testing is waste.”

That’s not the point.

The point is: the “unit” is rarely a class.
The “unit” is usually a boundary that matters.

For a BFF, that boundary is:

  • the HTTP contract the mobile app consumes

For a Flutter app, a useful boundary is often:

  • the Cubit/Bloc behavior (event/action in → states out)

When you test boundaries, refactoring becomes easy.
When you test wiring, refactoring becomes expensive.

What to test in a BFF: the HTTP contract

Instead of testing TransactionService in isolation, test the BFF like your mobile app uses it:

  1. start the real app

  2. send a real HTTP request

  3. assert on the real HTTP response

  4. mock only external systems (downstream services)
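The four steps above can be sketched with nothing but the JDK. In this sketch a toy BFF stands in for "the real app", and a second tiny HTTP server plays the downstream folio service. The `BoundaryTestSketch` class, the `/folio/validate` path on the stub, the 422 status for a rejection, and the JSON shapes are all assumptions for illustration; in a real project the framework's test support would start the app for you.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class BoundaryTestSketch {
    static final HttpClient CLIENT = HttpClient.newHttpClient();

    // Starts a small HTTP server on a free port that handles requests to `path`.
    static HttpServer serve(String path, HttpHandler handler) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext(path, handler);
        server.start();
        return server;
    }

    // Consumes the request body and writes a JSON response with the given status.
    static void reply(HttpExchange exchange, int status, String json) throws IOException {
        exchange.getRequestBody().readAllBytes();
        byte[] body = json.getBytes(StandardCharsets.UTF_8);
        exchange.getResponseHeaders().add("Content-Type", "application/json");
        exchange.sendResponseHeaders(status, body.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(body);
        }
    }

    // Sends a JSON POST and returns "<status> <body>".
    static String post(String url, String json) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(json))
            .build();
        HttpResponse<String> response = CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() + " " + response.body();
    }

    // Runs one scenario end to end and returns what the client observed.
    static String run(String downstreamJson) {
        try {
            // Step 4: the only fake is the external system (the downstream folio service).
            HttpServer downstream = serve("/folio/validate", ex -> reply(ex, 200, downstreamJson));
            String validateUrl = "http://localhost:" + downstream.getAddress().getPort() + "/folio/validate";

            // Step 1: start the app under test (a toy BFF standing in for the real one).
            HttpServer bff = serve("/transactions", ex -> {
                try {
                    boolean eligible = post(validateUrl, "{}").replace(" ", "").contains("\"eligible\":true");
                    reply(ex, eligible ? 201 : 422,
                        eligible ? "{\"status\":\"SUCCESS\"}" : "{\"status\":\"REJECTED\"}");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    reply(ex, 500, "{\"status\":\"ERROR\"}");
                }
            });
            try {
                // Steps 2 and 3: send a real HTTP request; the caller asserts on the real response.
                return post("http://localhost:" + bff.getAddress().getPort() + "/transactions",
                    "{\"amount\":100}");
            } finally {
                bff.stop(0);
                downstream.stop(0);
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("{\"eligible\": true}"));
        System.out.println(run("{\"eligible\": false}"));
    }
}
```

The scenario is driven entirely by what the downstream stub returns, and the assertion is made entirely on what comes back over HTTP. Nothing in between is mocked or inspected.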

That means your tests cover the “stuff that actually breaks”:

  • routing

  • serialization

  • validation flow

  • error mapping

  • configuration

  • security headers (if any)

  • “this endpoint still behaves as promised”

Mini-example (Spring Boot BFF + WireMock)

Goal: Test /transactions end-to-end, while stubbing downstream microservices.

Structure:

  • Start your Spring Boot app in test mode

  • Stub dependent service calls using WireMock

  • Hit the BFF endpoint using RestAssured or WebTestClient

  • Assert on response status + payload

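```java
// Pseudocode-ish (keep your actual framework choices)
import static com.github.tomakehurst.wiremock.client.WireMock.okJson;
import static com.github.tomakehurst.wiremock.client.WireMock.post;
import static com.github.tomakehurst.wiremock.client.WireMock.stubFor;
import static com.github.tomakehurst.wiremock.client.WireMock.urlEqualTo;
import static io.restassured.RestAssured.given;
import static org.hamcrest.Matchers.equalTo;

import org.junit.jupiter.api.Test;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.test.web.server.LocalServerPort; // Spring Boot 3.x package
import org.springframework.cloud.contract.wiremock.AutoConfigureWireMock;

@AutoConfigureWireMock(port = 0) // random WireMock port, exposed as ${wiremock.server.port}
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
class TransactionApiTest {

    @LocalServerPort
    int port; // the random port the BFF itself listens on

    @Test
    void createsTransaction_whenFolioValidationSucceeds() {
        // Stub the downstream folio service, not any internal class
        stubFor(post(urlEqualTo("/folio/validate"))
            .willReturn(okJson("{\"eligible\": true}")));

        given()
            .port(port)
            .contentType("application/json")
            .body("{\"amount\":100}")
        .when()
            .post("/transactions")
        .then()
            .statusCode(201)
            .body("status", equalTo("SUCCESS"));
    }
}
```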

Notice what we did not do:

  • no mocking internal services

  • no asserting which internal method was called

  • no test failures when you rename classes

As long as the contract stays correct, the test stays stable.

That’s ROI.

What to test in a Flutter app: Cubit boundary behavior

For mobile apps, class-level mocks can become even more brittle because UI and state management evolve quickly.

A useful test boundary here is the Cubit:

  • create the real Cubit

  • mock only the HTTP layer (or repository boundary)

  • trigger the action

  • assert on state transitions

Mini-example (Cubit test)

```dart
import 'package:bloc_test/bloc_test.dart';
import 'package:mocktail/mocktail.dart';

// MockApi, TransactionCubit, TransactionState and ApiResult come from your app and test code.
void main() {
  blocTest<TransactionCubit, TransactionState>(
    'emits loading then success when API returns 201',
    build: () {
      final api = MockApi();
      when(() => api.createTransaction(any()))
          .thenAnswer((_) async => ApiResult.success());
      return TransactionCubit(api);
    },
    act: (cubit) => cubit.submit(amount: 100),
    expect: () => [
      TransactionState.loading(),
      TransactionState.success(),
    ],
  );
}
```

Now you can refactor repositories, validators, mapping layers - tests still pass as long as behavior stays intact.

The code coverage trap (and what to measure instead)

Coverage measures: “which lines executed.”

It does not measure: “did we verify the behavior users care about?”

Coverage targets often lead to:

  • testing getters/setters

  • testing constructors

  • testing wiring

…great numbers, low protection.

A better metric (especially for leadership) is:

Scenario coverage: what % of critical business scenarios are automated at the boundary?

Example:

  • 6 of 9 critical flows automated

  • 2 remain manual due to external constraints

  • 1 partially automated

That tells you what the user can rely on.
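The arithmetic behind that number is simple enough to keep next to the scenario list. Here is a minimal sketch using the example above; the `ScenarioCoverage` class and its `Status` enum are made up for illustration, and it assumes a partially automated flow does not count as automated.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ScenarioCoverage {
    enum Status { AUTOMATED, PARTIALLY_AUTOMATED, MANUAL }

    // Percentage of critical flows that are fully automated at the boundary.
    // Assumption: partially automated flows are not counted as automated.
    static double automatedPercent(List<Status> flows) {
        long automated = flows.stream().filter(s -> s == Status.AUTOMATED).count();
        return 100.0 * automated / flows.size();
    }

    // The example above: 6 automated, 2 manual, 1 partially automated.
    static List<Status> articleExample() {
        List<Status> flows = new ArrayList<>(Collections.nCopies(6, Status.AUTOMATED));
        flows.addAll(Collections.nCopies(2, Status.MANUAL));
        flows.add(Status.PARTIALLY_AUTOMATED);
        return flows;
    }

    public static void main(String[] args) {
        // 6 of 9 flows, i.e. roughly 67%
        System.out.printf("Scenario coverage: %.0f%%%n", automatedPercent(articleExample()));
    }
}
```

Whether a partial flow counts as half, zero, or gets reported separately is a team decision; what matters is that the denominator is the list of critical scenarios, not the number of lines.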

“But won’t boundary tests be slower?”

Yes — boundary tests are usually slower than tiny, mock-heavy unit tests.
But in practice, the slowdown is often manageable.

Spring helps here: it caches the test context and reuses it across tests that share the same setup. So if you have 50 boundary tests, you often pay the “boot-up” cost once, not 50 times.

For most BFF contract tests, you can also start lighter:

  • Use @WebMvcTest as a default when you mainly want to validate the web layer (routing, validation, serialization, error mapping, payload shape). It loads only the web slice, so the beans behind your controllers are typically supplied as mocks.

  • Use @SpringBootTest + RANDOM_PORT only when you truly need the whole application wired end-to-end.

One more practical trick: layer your pipeline.

  • Run fast unit tests on every push/commit.

  • Run boundary tests as the main gate on PR/merge.

So developers get quick feedback while coding, and the slower tests only block at the point where it matters.

But here’s the deeper trade-off:

  • Mock tests are fast, but brittle.

  • Boundary tests are slower, but durable.

And in real product work, durability usually wins.

Because the most expensive part isn’t running tests.
It’s what happens when a test suite becomes painful:

  • rewriting tests after refactors

  • ignoring failures because the suite is noisy

  • slowly losing trust in the tests altogether

Where classic unit tests still shine

There’s still a place for small unit tests:

  • pure functions

  • complex calculations

  • tricky edge-case logic

  • parsing/formatting rules

  • algorithmic code

These are places where mocks don’t dominate and behavior is stable.
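As an illustration of that kind of code, here is a small sketch of a formatting rule tested directly, with no mocks in sight. `AmountFormatter` and its rule (minor units to a two-decimal string) are hypothetical, chosen only to show the shape of such a test.

```java
class AmountFormatter {
    // Hypothetical formatting rule: minor units (e.g. cents) to a two-decimal string.
    // Not handled in this sketch: Long.MIN_VALUE, currencies without two decimals.
    static String format(long minorUnits) {
        String sign = minorUnits < 0 ? "-" : "";
        long abs = Math.abs(minorUnits);
        long fraction = abs % 100;
        return sign + (abs / 100) + "." + (fraction < 10 ? "0" : "") + fraction;
    }

    // Plain input/output checks: the whole "unit" is one pure function.
    static boolean edgeCasesHold() {
        return format(1234).equals("12.34")
            && format(5).equals("0.05")
            && format(0).equals("0.00")
            && format(-150).equals("-1.50");
    }

    public static void main(String[] args) {
        System.out.println("edge cases hold: " + edgeCasesHold());
    }
}
```

There is nothing to wire and nothing to stub, so the test can only break when the rule itself changes.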

The mistake is making class-level unit tests the default for everything.

Summary

“Test every class, mock dependencies, chase coverage” sounds rigorous – but often creates brittle suites that fight refactoring and modernization.

A higher-ROI approach is:

  • test behavior at meaningful boundaries

  • mock only external dependencies

  • measure scenario coverage, not just line coverage
