What to Test (and What Not) for High Test Automation ROI

March 15, 2026

by Shrikant Vashishtha

Test Automation

Software Quality

Continuous Delivery

A team I worked with recently had a familiar problem.

They owned a BFF gateway (Backend-for-Frontend) between a mobile app and a bunch of microservices. Validations, orchestration, error mapping, retries – the “simple glue” that quietly becomes your production heartbeat.

And yet…

The codebase had zero automated tests.

So they did what most sincere teams do:

“Let’s start unit tests. One test-class per class. Mock dependencies. Target 80% coverage.”

It sounds responsible. It sounds mature.

But I pushed back.

Not because unit tests are bad.
Because this pattern often produces a test suite that causes more problems than it solves.

The trap: tests that punish refactoring

Take a simple chain:

TransactionService → ValidationService → FolioRepository

The typical “unit test every class” playbook says:

Test TransactionService by mocking ValidationService
Test ValidationService by mocking FolioRepository

Now look at what those tests usually assert:

which method got called
with which arguments
in what sequence
and sometimes even how many times

That’s not testing behavior.

That’s testing wiring.

Now refactor:

split ValidationService into smaller validators
introduce a cache
change internal method signatures

The user doesn’t care. The API behavior hasn’t changed.

But your tests break.

So you spend hours rewriting tests… not because the product broke, but because the internals moved.

That’s the moment test automation stops being a safety net and starts becoming “work you must do before you can do real work.”
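To make the difference concrete, here is a minimal plain-Java sketch. The names (Validator, RecordingValidator, submitV1, submitV2) are invented for illustration and only loosely stand in for the chain above. A hand-rolled spy replaces a mocking library so the example runs on a bare JDK. Both versions return the same result for the same input, yet the wiring-style check holds for one and not for the other.

```java
import java.util.ArrayList;
import java.util.List;

class WiringVsBehavior {

  interface Validator {
    boolean isValid(int amount);
  }

  // Hand-rolled spy standing in for a mocking library: it records every call it receives.
  static class RecordingValidator implements Validator {
    final List<Integer> calls = new ArrayList<>();

    @Override
    public boolean isValid(int amount) {
      calls.add(amount);
      return amount > 0;
    }
  }

  // Original implementation: always delegates to the validator.
  static String submitV1(int amount, Validator validator) {
    return validator.isValid(amount) ? "SUCCESS" : "REJECTED";
  }

  // Refactored implementation: rejects obviously bad input early and skips the validator.
  // Callers see exactly the same results as with submitV1.
  static String submitV2(int amount, Validator validator) {
    if (amount <= 0) {
      return "REJECTED";
    }
    return validator.isValid(amount) ? "SUCCESS" : "REJECTED";
  }

  public static void main(String[] args) {
    RecordingValidator spyV1 = new RecordingValidator();
    RecordingValidator spyV2 = new RecordingValidator();

    // Behavior check: same outcome from both versions, so it survives the refactor.
    System.out.println(submitV1(-5, spyV1).equals(submitV2(-5, spyV2)));

    // Wiring check ("validator was called once with -5"): holds for V1 only.
    System.out.println(spyV1.calls.equals(List.of(-5)));
    System.out.println(spyV2.calls.equals(List.of(-5)));
  }
}
```

A test built on the last kind of check fails after the refactor even though no caller could notice the change.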

A simple question to spot low-ROI tests

Whenever I review a test suite, I ask this:

If I refactor the internals without changing behavior, do the tests still pass?

If they break, the tests weren’t protecting behavior.
They were protecting the current implementation.

And implementation changes constantly.

Don’t throw away unit tests. Change what “unit” means.

This is where most debates become unproductive.

Some people hear this critique and conclude: “Unit testing is waste.”

That’s not the point.

The point is: the “unit” is rarely a class.
The “unit” is usually a boundary that matters.

For a BFF, that boundary is:

the HTTP contract the mobile app consumes

For a Flutter app, a useful boundary is often:

the Cubit/Bloc behavior (event/action in → states out)

When you test boundaries, refactoring becomes easy.
When you test wiring, refactoring becomes expensive.

What to test in a BFF: the HTTP contract

Instead of testing TransactionService in isolation, test the BFF like your mobile app uses it:

start the real app
send a real HTTP request
assert on the real HTTP response
mock only external systems (downstream services)

That means your tests cover the “stuff that actually breaks”:

routing
serialization
validation flow
error mapping
configuration
security headers (if any)
“this endpoint still behaves as promised”

Mini-example (Spring Boot BFF + WireMock)

Goal: Test /transactions end-to-end, while stubbing downstream microservices.

Structure:

Start your Spring Boot app in test mode
Stub dependent service calls using WireMock
Hit the BFF endpoint using RestAssured or WebTestClient
Assert on response status + payload

java

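// Pseudocode-ish (keep your actual framework choices; adjust imports to your versions)
import static com.github.tomakehurst.wiremock.client.WireMock.*;
import static io.restassured.RestAssured.given;
import static org.hamcrest.Matchers.equalTo;

import io.restassured.RestAssured;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.cloud.contract.wiremock.AutoConfigureWireMock;

// Assumes the downstream base URL in the test config points at localhost:${wiremock.server.port}
@AutoConfigureWireMock(port = 0)
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
class TransactionApiTest {

  @Value("${local.server.port}") int port; // random port the app was started on

  @BeforeEach
  void pointRestAssuredAtTheApp() {
    RestAssured.port = port;
  }

  @Test
  void createsTransaction_whenDownstreamValidationPasses() {
    stubFor(post(urlEqualTo("/folio/validate"))
      .willReturn(okJson("{\"eligible\": true}")));

    given()
      .contentType("application/json")
      .body("{\"amount\":100}")
    .when()
      .post("/transactions")
    .then()
      .statusCode(201)
      .body("status", equalTo("SUCCESS"));
  }
}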

Notice what we did not do:

no mocking internal services
no asserting which internal method was called
no test failures when you rename classes

As long as the contract stays correct, the test stays stable.

That’s ROI.
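The error-mapping case from the list above can be covered the same way. Below is a framework-free sketch of the idea that uses only the JDK's built-in HTTP server and client. The toy gateway, the 502 mapping and the DOWNSTREAM_ERROR payload are assumptions made up for illustration, not the team's actual contract. It shows the shape of a boundary test: stub the downstream over HTTP, call the gateway over HTTP, and look only at what the client receives.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class BoundaryTestSketch {

  static final class Reply {
    final int status;
    final String body;
    Reply(int status, String body) { this.status = status; this.body = body; }
  }

  interface Responder {
    Reply respond(byte[] requestBody) throws IOException;
  }

  // Starts an HTTP server on a free loopback port; requests to `path` are answered by `responder`.
  static HttpServer serve(String path, Responder responder) throws IOException {
    HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
    server.createContext(path, exchange -> {
      Reply reply = responder.respond(exchange.getRequestBody().readAllBytes());
      byte[] bytes = reply.body.getBytes(StandardCharsets.UTF_8);
      exchange.getResponseHeaders().add("Content-Type", "application/json");
      exchange.sendResponseHeaders(reply.status, bytes.length);
      exchange.getResponseBody().write(bytes);
      exchange.close();
    });
    server.start();
    return server;
  }

  static String baseUrl(HttpServer server) {
    return "http://127.0.0.1:" + server.getAddress().getPort();
  }

  static HttpResponse<String> post(String url, String json) throws IOException {
    HttpClient client = HttpClient.newBuilder().version(HttpClient.Version.HTTP_1_1).build();
    HttpRequest request = HttpRequest.newBuilder(URI.create(url))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(json))
        .build();
    try {
      return client.send(request, HttpResponse.BodyHandlers.ofString());
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IOException("interrupted while calling " + url, e);
    }
  }

  // Toy gateway (hypothetical): forwards to the downstream validator and maps
  // a 200 to 201/SUCCESS and anything else to 502/DOWNSTREAM_ERROR.
  static HttpServer startGateway(String downstreamBaseUrl) throws IOException {
    return serve("/transactions", body -> {
      HttpResponse<String> downstream =
          post(downstreamBaseUrl + "/folio/validate", new String(body, StandardCharsets.UTF_8));
      return downstream.statusCode() == 200
          ? new Reply(201, "{\"status\":\"SUCCESS\"}")
          : new Reply(502, "{\"status\":\"DOWNSTREAM_ERROR\"}");
    });
  }

  // One boundary scenario: stub the downstream, call the gateway, return what the client saw.
  static Reply scenario(int downstreamStatus) {
    HttpServer downstream = null;
    HttpServer gateway = null;
    try {
      downstream = serve("/folio/validate", body -> new Reply(downstreamStatus, "{\"eligible\":true}"));
      gateway = startGateway(baseUrl(downstream));
      HttpResponse<String> response = post(baseUrl(gateway) + "/transactions", "{\"amount\":100}");
      return new Reply(response.statusCode(), response.body());
    } catch (IOException e) {
      throw new IllegalStateException(e);
    } finally {
      if (gateway != null) gateway.stop(0);
      if (downstream != null) downstream.stop(0);
    }
  }

  public static void main(String[] args) {
    Reply happy = scenario(200);
    Reply failing = scenario(500);
    System.out.println(happy.status + " " + happy.body);
    System.out.println(failing.status + " " + failing.body);
  }
}
```

Nothing in the scenario knows how the gateway is built internally, so the handler can be split, cached or rewritten without touching the test.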

What to test in a Flutter app: Cubit boundary behavior

For mobile apps, class-level mocks can become even more brittle because UI and state management evolve quickly.

A useful test boundary here is the Cubit:

create the real Cubit
mock only the HTTP layer (or repository boundary)
trigger the action
assert on state transitions

Mini-example (Cubit test)

dart

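// Assumes the bloc_test and mocktail packages. MockApi, ApiResult,
// TransactionCubit and TransactionState are the app's own types.
import 'package:bloc_test/bloc_test.dart';
import 'package:mocktail/mocktail.dart';

void main() {
  // If createTransaction takes a custom type, mocktail needs
  // registerFallbackValue(...) before any() can match it.
  blocTest<TransactionCubit, TransactionState>(
    'emits loading then success when API returns 201',
    build: () {
      final api = MockApi();
      when(() => api.createTransaction(any()))
        .thenAnswer((_) async => ApiResult.success());
      return TransactionCubit(api);
    },
    act: (cubit) => cubit.submit(amount: 100),
    expect: () => [
      TransactionState.loading(),
      TransactionState.success(),
    ],
  );
}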

Now you can refactor repositories, validators and mapping layers, and the tests still pass as long as behavior stays intact.

The code coverage trap (and what to measure instead)

Coverage measures: “which lines executed.”

It does not measure: “did we verify the behavior users care about?”

Coverage targets often lead to:

testing getters/setters
testing constructors
testing wiring

…great numbers, low protection.

A better metric (especially for leadership) is:

Scenario coverage: what % of critical business scenarios are automated at the boundary?

Example:

6 of 9 critical flows automated
2 remain manual due to external constraints
1 partially automated

That tells you what the user can rely on.
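The number itself is simple arithmetic. A small sketch follows, assuming that only fully automated flows count and that partially automated ones do not. That counting rule is my assumption here, so pick whichever rule your team agrees on. The class and enum names are made up for illustration.

```java
import java.util.List;
import java.util.Locale;

class ScenarioCoverage {

  enum Status { AUTOMATED, PARTIAL, MANUAL }

  // Share of critical scenarios that are fully automated at the boundary, as a percentage.
  // Assumption: PARTIAL does not count as automated.
  static double automatedPercent(List<Status> scenarios) {
    if (scenarios.isEmpty()) {
      return 0.0;
    }
    long automated = scenarios.stream().filter(s -> s == Status.AUTOMATED).count();
    return 100.0 * automated / scenarios.size();
  }

  public static void main(String[] args) {
    // The nine critical flows from the example: 6 automated, 2 manual, 1 partial.
    List<Status> flows = List.of(
        Status.AUTOMATED, Status.AUTOMATED, Status.AUTOMATED,
        Status.AUTOMATED, Status.AUTOMATED, Status.AUTOMATED,
        Status.MANUAL, Status.MANUAL,
        Status.PARTIAL);
    System.out.printf(Locale.ROOT, "Scenario coverage: %.1f%%%n", automatedPercent(flows));
    // prints "Scenario coverage: 66.7%"
  }
}
```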

“But won’t boundary tests be slower?”

Yes. Boundary tests are usually slower than tiny, mock-heavy unit tests.
But in practice, the slowdown is often manageable.

Spring helps here: its TestContext framework caches the application context and reuses it across test classes that share the same configuration. So if you have 50 boundary tests, you often pay the “boot-up” cost once, not 50 times.

For most BFF contract tests, you can also start lighter:

Use @WebMvcTest as a default when you mainly want to validate the web layer (routing, validation, serialization, error mapping, payload shape). It loads only the web slice, so the services behind the controllers have to be supplied as mocks.
Use @SpringBootTest + RANDOM_PORT only when you truly need the whole application wired end-to-end.

One more practical trick: layer your pipeline.

Run fast unit tests on every push/commit.
Run boundary tests as the main gate on PR/merge.

So developers get quick feedback while coding, and the slower tests only block at the point where it matters.

But here’s the deeper trade-off:

Mock tests are fast, but brittle.
Boundary tests are slower, but durable.

And in real product work, durability usually wins.

Because the most expensive part isn’t running tests.
It’s what happens when a test suite becomes painful:

rewriting tests after refactors
ignoring failures because the suite is noisy
slowly losing trust in the tests altogether

Where classic unit tests still shine

There’s still a place for small unit tests:

pure functions
complex calculations
tricky edge-case logic
parsing/formatting rules
algorithmic code

These are places where mocks don’t dominate and behavior is stable.
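For instance, a parsing rule like the one below is a natural fit for small, direct tests. The rule itself (decimal strings converted to minor units, at most two decimals, positive only) is a hypothetical example, not something taken from the team's codebase.

```java
import java.math.BigDecimal;

class AmountParser {

  // Hypothetical rule: amounts arrive as decimal strings and are stored as minor units (for example cents).
  static long toMinorUnits(String text) {
    if (text == null || text.isBlank()) {
      throw new IllegalArgumentException("amount is required");
    }
    BigDecimal amount = new BigDecimal(text.trim()); // NumberFormatException for non-numeric input
    if (amount.signum() <= 0) {
      throw new IllegalArgumentException("amount must be positive");
    }
    if (amount.scale() > 2) {
      throw new IllegalArgumentException("at most two decimal places allowed");
    }
    return amount.movePointRight(2).longValueExact();
  }

  // Small helper so the edge cases read as one-liners.
  static boolean rejects(String text) {
    try {
      toMinorUnits(text);
      return false;
    } catch (IllegalArgumentException | ArithmeticException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(toMinorUnits("100"));     // whole amount
    System.out.println(toMinorUnits("0.10"));    // trailing zero must not be lost
    System.out.println(toMinorUnits(" 99.99 ")); // surrounding whitespace
    System.out.println(rejects("-1"));           // negative amount
    System.out.println(rejects("1.005"));        // too many decimals
    System.out.println(rejects("abc"));          // not a number
  }
}
```

There is nothing to mock here: input goes in, a value or an error comes out, and the tests only change when the rule changes.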

The mistake is making class-level unit tests the default for everything.

Summary

“Test every class, mock dependencies, chase coverage” sounds rigorous – but often creates brittle suites that fight refactoring and modernization.

A higher-ROI approach is:

test behavior at meaningful boundaries
mock only external dependencies
measure scenario coverage, not just line coverage
