How to Test Your Dragon: Breaking Down a Team Test Strategy for Distributed Systems, Part 2
Created by: Richard Kevin Kabiling · 19 min read
Dec 12, 2022
Previously
In part one of the series, we established that while test automation is an effective tool for boosting software quality and team performance, testing the system unsystematically can introduce effects that reduce that effectiveness.
We also dissected an application and its backing system down to a plausible architecture of a single system component to better understand a system under test and the different integration tiers within an architecture.
Finally, we discussed a strategy on a high level that lends direction and guidance to a possibly more detailed testing strategy.
We are now ready to move forward and dive into the specifics!
Strategy in Detail
As previously underlined, the test pyramid provides an excellent structure for our test strategy because it effectively establishes the relationship between the component integration spectrum and test development and maintenance cost.
Aligning with the test pyramid, we will further group these tests into low-level and high-level tests.
A comparison of low-level and high-level tests
 | Low-Level Tests | High-Level Tests |
---|---|---|
Test Object | parts of an application via direct code invocation | one or more integrated system components (running applications) via their interfaces and environments |
Coverage Focus | line coverage, branch coverage | acceptance coverage, technical acceptance coverage |
Test Types | unit tests; white box integration tests (examples: application white box integration tests, infrastructure white box integration tests) | black box integration tests (or acceptance tests) distributed across test suites that focus on discrete levels of integration (examples: system component black box integration, system end-to-end black box integration, system end-to-end with UI black box integration) |
Environment and Dependencies | run locally or in CI; dependencies in code are mocked unless mocking proves too expensive, in which case collaborators are emulated or stubbed (i.e., infrastructure dependencies) | run locally or in CI; dependencies external to the scope and environment are emulated or stubbed; a subset of the tests may optionally be run in a test environment; an even smaller subset may optionally be run in production as smoke or sanity tests |
The term integration test is used very liberally in the community and can mean a lot of things:
- testing two classes
- testing a function or a method integrated with a channel
- testing a function or a method integrated with an infrastructure component
- testing a fully integrated application or component
- testing multiple fully integrated components
While these are all valid examples of integration tests, the ambiguity does not help the discourse. To be specific, I will add qualifiers to the term integration test to communicate exactly what I mean.
Please see the examples shown below from this repository.
Low-Level Tests
Low-level tests interact directly with the code components that make up a single application. These code components are very low-level: classes, functions, or some combination of them. In the context of hexagonal architecture, this means testing the different code components found in each "layer" of the architecture. Precisely, these are the tests the strategy recommends:
- unit tests
- white box integration tests
Because we are only testing parts of the application, these tests are generally:
- very fast
- very simple
- very cheap to write
Structurally, white box tests are characterized by one or both of the following:
- direct function or method invocation
- usage of mocks or fakes
Most importantly, these tests focus on line and branch coverage of all code paths written by the engineers (emphasis on written by the engineers). This guarantees validity at the lowest levels of the system.
Core Domain
@Test
void throwsWhenAccountNotFound() {
    var currency = Currency.getInstance("PHP");
    var accountId = new AccountId(UUID.randomUUID().toString());
    when(retrieveAccountPort.findAccount(accountId)).thenReturn(Mono.empty());
    var merchant = merchant(currency, 200L);
    when(retrieveMerchantPort.findMerchant(merchant.id())).thenReturn(Mono.just(merchant));

    var command = new PayCommand(
            accountId,
            merchant.id(),
            500,
            currency
    );

    var throwable = catchThrowableOfType(() -> payUseCase.pay(command).block(), SourceAccountNotFoundException.class);

    assertThat(throwable.getSourceId()).isEqualTo(accountId);
}
An example unit test verifying that the pay function throws a SourceAccountNotFoundException
The core domain holds the main business logic of the application. Mainly, this means a lot of the work here involves validation, processing, and orchestration instead of integration. Therefore, if done right, it will most likely contain the bulk of the logic written by the team. Because of this, most tests in the core domain layer are unit tests, i.e., fast, isolated, portable, function-level tests with most dependencies mocked, faked, or stubbed.
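For context, the unit test excerpt above omits its fixture. Below is a minimal sketch of how the mocks and the use case might be wired, assuming Mockito with JUnit 5; the class name and constructor arguments are illustrative, not the repository's exact code.

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;

// Hypothetical fixture for the unit test above: the outbound ports are plain Mockito mocks
// and the use case is constructed directly, with no framework context involved.
@ExtendWith(MockitoExtension.class)
class PaymentServiceUnitTest {

    @Mock
    RetrieveAccountPort retrieveAccountPort;

    @Mock
    RetrieveMerchantPort retrieveMerchantPort;

    PayUseCase payUseCase;

    @BeforeEach
    void setUp() {
        // PaymentService is assumed to be the core domain implementation of PayUseCase;
        // the real constructor may take additional ports
        payUseCase = new PaymentService(retrieveAccountPort, retrieveMerchantPort);
    }

    // ... unit tests such as throwsWhenAccountNotFound() live here
}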
@SpringBootTest(classes = {
        PaymentService.class,
        ValidationAutoConfiguration.class
})
class PayUseCaseTest {

    ...

    @ParameterizedTest
    @MethodSource
    void throwsWhenInvalidParameters(PayCommand command, String messageSegment) {
        var throwable = catchThrowableOfType(() -> payUseCase.pay(command), ConstraintViolationException.class);
        assertThat(throwable).hasMessageContaining(messageSegment);
    }

    static Stream<Arguments> throwsWhenInvalidParameters() {
        var sourceId = new AccountId(UUID.randomUUID().toString());
        var merchantId = new MerchantId(UUID.randomUUID().toString());
        var amount = 0;
        var currency = Currency.getInstance("PHP");
        return Stream.of(
                of(null, "pay.command: must not be null"),
                of(new PayCommand(null, merchantId, amount, currency), "pay.command.sourceId: must not be null"),
                of(new PayCommand(sourceId, null, amount, currency), "pay.command.merchantId: must not be null"),
                of(new PayCommand(sourceId, merchantId, -1L, currency), "pay.command.amount: must be greater than or equal to 0"),
                of(new PayCommand(sourceId, merchantId, amount, null), "pay.command.currency: must not be null"),
                of(new PayCommand(new AccountId(null), merchantId, amount, currency), "pay.command.sourceId.value: must not be empty"),
                of(new PayCommand(new AccountId(""), merchantId, amount, null), "pay.command.sourceId.value: must not be empty"),
                of(new PayCommand(sourceId, new MerchantId(null), amount, null), "pay.command.merchantId.value: must not be empty"),
                of(new PayCommand(sourceId, new MerchantId(""), amount, null), "pay.command.merchantId.value: must not be empty")
        );
    }
}
An example white box integration test that loads validation configuration and tests validation annotations using parameterized tests
When there is functionality we cannot exercise through plain unit tests, such as annotation-based validation in Java and Spring (JSR-303), we may write additional white box integration tests, i.e., white box tests (as defined above) that also bring other active components and configurations into the mix during testing.
These white box integration tests tend to blur the definition of integration and unit tests. Thus in some schools of thought, they are still considered unit tests. To be explicit, I have chosen to stick with the term integration test.
We may also introduce additional unit tests to bolster coverage for utilities, strategies, and similar constructs.
Infrastructure Adapters
static {
    TestEnvironment.start();
}

@Test
void retrievesWhenExists() {
    var currency = Currency.getInstance("PHP");
    var id = new AccountId(UUID.randomUUID().toString());
    var record = new AccountRecord(id.value(), 2000L, currency.getCurrencyCode(), currency.getDefaultFractionDigits(), null);
    repository.save(record).block();

    var result = port.findAccount(id).block();

    var expected = new Account(id, 2000L, currency, 0L);
    assertThat(result).usingRecursiveComparison()
            .isEqualTo(expected);
}
An infrastructure white box integration test that connects directly to a database (running in Docker)
The infrastructure layer of the hexagonal architecture focuses on integration with infrastructure components and other services like databases, queues, and 3rd party APIs. Typically, this involves the adapter converting domain objects into requests and then dispatching them through SDKs to perform the actual work.
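To make the adapter's role concrete, here is a minimal sketch of what the persistence adapter exercised by the test above might look like. The class name, record accessors, and mapping logic are assumptions inferred from the test, not the repository's actual code.

import java.util.Currency;

import reactor.core.publisher.Mono;

// Hypothetical outbound adapter: translates between the domain model (Account) and the
// persistence record (AccountRecord) and delegates the actual I/O to a reactive repository.
class AccountPersistenceAdapter implements RetrieveAccountPort {

    private final AccountRecordRepository repository; // e.g., a Spring Data reactive repository

    AccountPersistenceAdapter(AccountRecordRepository repository) {
        this.repository = repository;
    }

    @Override
    public Mono<Account> findAccount(AccountId id) {
        return repository.findById(id.value())
                .map(record -> new Account(
                        new AccountId(record.id()),
                        record.balance(),
                        Currency.getInstance(record.currencyCode()),
                        record.version() == null ? 0L : record.version()));
    }
}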
The preferred tests for the infrastructure layer are infrastructure white box integration tests rather than mock-heavy unit tests. SDKs in particular are notorious for being very difficult to mock (case in point: the AWS SDK), and mocking to that degree is tedious, challenging to follow, and yields unreadable tests.
Let's break this down. In the implementation of infrastructure white box integration tests, we prefer the following:
- the methods of the adapter are invoked directly; for simplicity, tests may be limited to the primary adapter interface, which should already cover most of the orchestration, if any
- the infrastructure layer connects to a local emulation of the infrastructure to keep things portable and predictable
- the return values from the function invocation are verified
- the state of the emulated infrastructure is verified
See the following examples.
Test Object | Test |
---|---|
UserRepositoryAdapter.save(...) invokes multiple repository instances that map directly to a PostgreSQL table (UserRepository, AddressRepository, ContactDetailsRepository, FriendshipRepository) to save a User. | The test starts up an embedded HSQL database with PostgreSQL emulation (or a PostgreSQL container in docker compose). The test invokes the save method directly to save a user. The test queries PostgreSQL directly via some existing infrastructure code or framework to verify that the User is saved. The test cleans up the environment. |
ForeignExchangeServiceAdapter.list invokes a 3rd party API to list current foreign exchange rates. | The test starts an embedded Wiremock instance (or a Wiremock container in docker compose) and stubs the return values. The test invokes the list method directly to retrieve rates. The test validates that the returned list is as expected. The test cleans up the environment. |
FileStorageAdapter.upload uploads a file to AWS S3. | The test starts up a localstack instance in docker compose and creates the appropriate buckets. The test invokes the upload method directly. The test validates that the file was indeed uploaded to localstack S3. The test cleans up the environment. |
A table illustrating examples of white box integration tests for the infrastructure adapters
Spinning up large infrastructure components during the test may take time. To maintain a low execution duration, the tests should spin up these environments carefully and cache them across the test lifecycle, for example by starting them once per test run and reusing them, as sketched below.
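One common way to achieve this, assuming Testcontainers provides the emulated infrastructure (the repository's TestEnvironment helper may be built differently), is the singleton container pattern: the container starts once per JVM and is shared by every test class that asks for it.

import org.testcontainers.containers.PostgreSQLContainer;

// Sketch of a TestEnvironment-style helper using the Testcontainers singleton pattern.
// The database container starts once for the whole test run and is reused by every test class.
final class TestEnvironment {

    private static final PostgreSQLContainer<?> POSTGRES =
            new PostgreSQLContainer<>("postgres:15-alpine");

    static synchronized void start() {
        if (!POSTGRES.isRunning()) {
            POSTGRES.start(); // expensive, so it happens only once
            // expose connection details so the code under test can pick them up
            // (property names here assume Spring R2DBC and are illustrative)
            System.setProperty("spring.r2dbc.url", "r2dbc:postgresql://" + POSTGRES.getHost()
                    + ":" + POSTGRES.getFirstMappedPort() + "/" + POSTGRES.getDatabaseName());
            System.setProperty("spring.r2dbc.username", POSTGRES.getUsername());
            System.setProperty("spring.r2dbc.password", POSTGRES.getPassword());
        }
    }

    private TestEnvironment() {
    }
}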
Similar to the core domain, we may introduce additional unit tests to bolster coverage for utilities, strategies, and similar constructs. Furthermore, while not recommended, we may add more unit tests to the adapter and its dependencies if coverage is insufficient.
Application Adapters
@Test
void returnsErrorOnSourceNotFound() {
    var sourceId = new AccountId(UUID.randomUUID().toString());
    var merchantId = new MerchantId(UUID.randomUUID().toString());
    var amount = 500L;
    var currency = Currency.getInstance("PHP");
    var command = new PayCommand(sourceId, merchantId, amount, currency);
    given(payUseCase.pay(command)).willThrow(new SourceAccountNotFoundException(sourceId));

    var request = new PaymentRequest(sourceId.value(), merchantId.value(), new BigDecimal("5.00"), currency);
    var result = webClient.post()
            .uri("/payments")
            .body(Mono.just(request), PaymentRequest.class)
            .exchange()
            .expectStatus().isEqualTo(422)
            .returnResult(ErrorResponse.class)
            .getResponseBody()
            .blockFirst();

    assertThat(result.code()).isEqualTo("SourceAccountNotFoundException");
    assertThat(result.message()).isEqualTo("Source account not found");
    assertThat(result.details().get("sourceId")).isEqualTo(sourceId.value());
}
An example application white box integration test that tests the POST /payments endpoint and uses a mocked use case
The application layer of the hexagonal architecture focuses on getting input from various channels and dispatching them as commands to the core domain layer for processing. The abstraction of an input adapter also sometimes involves middlewares, handlers, decorators, or filters apart from the adapter itself. These structures augment the adapter's behavior to add common concerns like parsing, security, logging, validation, error handling, etc.
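To ground this, the following is a sketch of what the HTTP adapter and its error-handling "middleware" might look like for the test above, assuming Spring WebFlux; the class names and response mapping are illustrative rather than the repository's actual code.

import java.util.Map;

import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.*;
import reactor.core.publisher.Mono;

// Hypothetical HTTP input adapter: parses the request, builds a command, and drives the use case.
@RestController
class PaymentController {

    private final PayUseCase payUseCase;

    PaymentController(PayUseCase payUseCase) {
        this.payUseCase = payUseCase;
    }

    @PostMapping("/payments")
    Mono<PaymentResponse> pay(@RequestBody PaymentRequest request) {
        var command = new PayCommand(
                new AccountId(request.sourceId()),
                new MerchantId(request.merchantId()),
                Amounts.toAdjustedAmount(request.currency(), request.amount()),
                request.currency());
        return payUseCase.pay(command).map(PaymentResponse::from); // response mapping assumed
    }
}

// Error handling is a cross-cutting concern configured alongside the adapter.
@RestControllerAdvice
class PaymentErrorHandler {

    @ExceptionHandler(SourceAccountNotFoundException.class)
    @ResponseStatus(HttpStatus.UNPROCESSABLE_ENTITY)
    ErrorResponse sourceAccountNotFound(SourceAccountNotFoundException e) {
        return new ErrorResponse(
                "SourceAccountNotFoundException",
                "Source account not found",
                Map.of("sourceId", e.getSourceId().value()));
    }
}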
The preferred tests for the application layers are application white box integration tests. These tests go through the integrating channel to test the functionality offered by the input adapters and all the configured middleware. Because the domain is not the prime concern here, we prefer to mock its use cases.
More specifically, we very strongly recommend the following when implementing application white box integration tests (a fixture sketch follows this list):
- the adapter is invoked via the channel it integrates with, to ensure that the relevant middlewares and handlers are exercised; for simplicity, the tests may also cover the dependencies of the adapter
- the core domain components that the application adapter drives are mocked
- the return values from the channel are verified
- the state of the mocks is verified
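The fixture for the excerpt above could then be wired as follows, assuming Spring's WebFlux test slice and the hypothetical PaymentController from the earlier sketch; the repository may configure this differently.

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.autoconfigure.web.reactive.WebFluxTest;
import org.springframework.boot.test.mock.mockito.MockBean;
import org.springframework.test.web.reactive.server.WebTestClient;

// Sketch of a fixture for an application white box integration test:
// the HTTP slice (adapter, converters, error handling) is loaded, the core domain is mocked.
@WebFluxTest(controllers = PaymentController.class)
class PaymentControllerTest {

    @Autowired
    WebTestClient webClient; // drives the adapter through the HTTP channel

    @MockBean
    PayUseCase payUseCase;   // the core domain use case is mocked

    // ... tests such as returnsErrorOnSourceNotFound() live here
}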
See the following examples.
Test Object | Test |
---|---|
A POST /users HTTP endpoint invokes SaveUserUseCase.save(…) to save users | The SaveUserUseCase is mocked. The HTTP server hosting the endpoint is started. The test invokes the POST /users endpoint via HTTP. The test validates the HTTP response. The test validates the mock state. |
A PaymentEventListener listens to payment events from the payment-events topic and invokes SummarizePaymentStatsUseCase.summarize to compute summary statistics of payments | SummarizePaymentStatsUseCase is mocked. The Kafka server is started. The Kafka PaymentEventListener is started. The test dispatches a payment event to the payment-events topic. The test validates the mock state. |
A table illustrating examples of white box integration tests for the application adapters
Spinning up listeners, servers, and infrastructure components during the test may take time. To keep execution duration low, their startup should be carefully planned and the resulting instances cached across the test lifecycle.
Similar to the core domain, we may introduce additional unit tests to bolster coverage for utilities, strategies, and similar constructs. Furthermore, while not recommended, more unit tests may be added to the adapter and its dependencies if coverage is insufficient.
Utilities, Strategies, and Similar Constructs
@ParameterizedTest
@MethodSource("source")
void areConvertedFromBigDecimalToLong(Currency currency, long amountInLong, BigDecimal amountInBigDecimal) {
    assertThat(Amounts.fromAdjustedAmount(currency, amountInLong)).isEqualTo(amountInBigDecimal);
}

@ParameterizedTest
@MethodSource("source")
void areConvertedFromLongToBigDecimal(Currency currency, long amountInLong, BigDecimal amountInBigDecimal) {
    assertThat(Amounts.toAdjustedAmount(currency, amountInBigDecimal)).isEqualTo(amountInLong);
}

public static Stream<Arguments> source() {
    return Stream.of(
            of(Currency.getInstance("PHP"), 4500, new BigDecimal("45.00")),
            of(Currency.getInstance("JPY"), 45, new BigDecimal("45")),
            of(Currency.getInstance("JOD"), 45000, new BigDecimal("45.000")),
            of(Currency.getInstance("PHP"), 500, new BigDecimal("5.00")),
            of(Currency.getInstance("JPY"), 500, new BigDecimal("500")),
            of(Currency.getInstance("JOD"), 500, new BigDecimal("0.500"))
    );
}
Example parameterized unit tests for a utility function used for converting from BigDecimal to Long based on the currency
As previously mentioned, some functionalities are abstracted within the application into utility functions, strategies, or similar constructs. The general recommendation for these is to unit test them accordingly.
Low-Level Tests in Summary
Low-level tests are white box tests that focus on code. More specifically, they should:
- test different parts of the application based on the architecture (core domain layer, application layer, infrastructure layer)
- test code that integrates with mocks, emulators, and stubs
- focus on line and branch coverage, not necessarily on high-level requirements
These are the low-level tests we discussed.
 | Unit | Application White Box Integration | Infrastructure White Box Integration |
---|---|---|---|
Application Section | core domain; utilities, strategies, and similar components | application adapters | infrastructure adapters |
Environment and Dependencies | dependencies are mocked | core domain use cases are mocked | environment is emulated |
Test Specifics | functions are invoked directly; validated against return value and mock states | invoked via channel or protocol; validated against channel response and mock states | functions are invoked directly; validated against return value and emulated environment state |
High-Level Tests
High-level tests are black box integration tests that interact with one or more real running applications through their exposed interfaces and environments, in contrast with low-level tests, which interact directly with code.
Characteristics
Because we are dealing with one or more applications at a time, these high-level black box integration tests generally:
- have larger environments
- take more resources
- take more time to execute
- require more orchestration
As previously established, these tests therefore cost more and should trend toward smaller numbers.
Structurally, these high-level black box integration tests are characterized by the following:
- they interact with the applications via their interfaces and environments
- they spin up real running applications for the objects under test, alongside their emulated environments
Despite this, it is of utmost importance to keep the tests completely runnable locally and, consequently, in CI to keep them portable, consistent, and reliable.
While it is mandated that high-level tests are runnable locally and in CI, they may be engineered so that a subset can also be run against a test environment and an even smaller subset against production as smoke or sanity tests. In summary (a small sketch follows the list below):
- tests should be runnable locally
- tests should be runnable in a CI environment
- a subset of tests may be configured to run against a test environment
- an even smaller subset of tests may be configured to run against production as smoke or sanity tests
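One lightweight way to carve out these subsets, assuming JUnit 5 drives the suites (the tag names and the TARGET_ENV variable are illustrative), is to tag tests and let each pipeline stage filter on the tags.

import org.junit.jupiter.api.Tag;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.condition.EnabledIfEnvironmentVariable;

// Sketch: everything runs locally and in CI by default; tagged subsets are selected
// (or additionally enabled) when running against shared environments.
class PaymentAcceptanceSubsetTest {

    @Test
    @Tag("test-environment") // the test-environment pipeline stage filters on this tag
    void paymentIsAcceptedAgainstDeployedComponent() {
        // ... drive the deployed system component through its public interface
    }

    @Test
    @Tag("smoke")
    @EnabledIfEnvironmentVariable(named = "TARGET_ENV", matches = "production")
    void paymentEndpointIsReachable() {
        // ... a read-only sanity check that is safe to run against production
    }
}

Gradle and Maven test runners can then include or exclude these tags per pipeline stage; Cucumber offers equivalent tag expressions for Gherkin scenarios.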
Multiple Suites, Multiple Integration Levels
We can run black box integration tests against various integration levels. To keep things organized and efficient, primarily when spinning up dependencies, it is to our benefit to separate these tests across suites that focus specifically on just one integration level.
Typically, tests at lower integration levels are more straightforward. However, they may not be able to cover a whole journey or hit as many components (and may miss the UI entirely). What they lack in range, they make up for in simplicity and depth of scope: they can cover fewer functionalities more deeply, via simpler tests in greater numbers.
These are the various levels of integration that we will focus on.
 | System Component | System End-to-End | System End-to-End with UI |
---|---|---|---|
Integration Level | a single running back-end application | one or more running back-end applications | one or more running back-end applications, including the UI |
Scenarios | consumption of propagated state; propagation of state outside the application; typically sections of a user journey; technical requirements of a single service | state propagation across applications; user journeys that cut across multiple applications; technical requirements that require orchestration of various services | state propagation across applications; user journeys from the UI; technical requirements that need orchestration of multiple services |
Environment and Dependencies | all collaborators of the application are emulated or stubbed | all 3rd party collaborators of the system are emulated or stubbed | all 3rd party collaborators of the system are emulated or stubbed |
Typical Procedure | The environment is spun up. The application is spun up. The environment is configured. The application receives a request and possibly returns a response. The test validates against the response. The test validates against the application state. The test validates against the environment. | The environment is spun up. The applications are spun up. The environment is configured. The applications are orchestrated. The test validates against the responses. The test validates against the application states of multiple applications. The test validates against the environment. | The environment is spun up. The applications are spun up. The environment is configured. The applications are orchestrated. The UI is automated. The test validates against the UI state. The test validates against the application states of multiple applications. The test validates against the environment. |
More Granular Slices, Maybe?
Depending on the size of the system under test and of the team that manages and maintains it, it may be appropriate to have test suites for more (or fewer) discrete levels of integration.
The strategy neither mandates nor advises against this and leaves the decision to the maintaining team. More specifically, such slices of the system may be tested in separate test suites, with the boundaries of each slice emulated or stubbed.
Very Important, Coverage Focus
More importantly, in contrast with low-level tests, which focus specifically on code coverage, high-level tests concentrate heavily on acceptance coverage. This means the scenarios these tests implement disregard low-level implementation details and focus only on requirements from the perspective of the users of the system, of team practice, and of governing bodies within the organization or the industry. Particularly:
- Business acceptance requirements - these include the functional requirements that most users of the system are concerned with.
- Technical acceptance requirements - these include the non-functional requirements that governing bodies are most concerned with, such as resilience, security, etc. These requirements may teeter close to non-functional use cases; still, it is up to the team's discretion to determine what is appropriate and what should be separated into another test suite (e.g., load and performance testing will most likely live in a non-functional performance test suite; live reliability and availability checks will most likely live in a chaos engineering test suite or script).
We can map these requirements directly to one or more scenarios. We then implement these scenarios and include them in one of the test suites depending on the required level of integration.
Again, Not Line or Branch Coverage
Although these tests still yield line and branch coverage, that is no longer a concern at this level, since the expectation is that low-level tests have already covered it. Moreover, at this level, chasing winding code paths would make acceptance tests unwieldy and unreadable.
Because of the shift in focus, coverage will naturally overlap, whether acceptance coverage or line/branch coverage. It is therefore essential to treat these coverage foci independently and make sure that:
- low-level tests alone focus on line and branch coverage
- high-level tests alone focus on acceptance requirements
This mindset prevents discussions from devolving into unproductive arguments about scenarios or coverage overlaps.
A Huge Note on Tooling
Feature: Payment

  Scenario: Successful Payment
    Given the following accounts exist:
      | id | currency | balance |
      | A  | PHP      | 500.00  |
      | B  | PHP      | 250.00  |
    And the following merchants exist:
      | id | account id |
      | X  | B          |
    When the client pays PHP 50.00 using account "A" to merchant "X"
    Then the payment is accepted
    And the payment is saved
    And the payment has 2 transaction entries
    And the payment has a DEBIT transaction entry of PHP 50.00 on account "A"
    And the payment has a CREDIT transaction entry of PHP 50.00 on account "B"
    And the account "A" balance is PHP 450.00
    And the account "B" balance is PHP 300.00

  Scenario: Failed Payment due to Non-existent Account
    Given the following accounts exist:
      | id | currency | balance |
      | A  | PHP      | 500.00  |
      | B  | PHP      | 250.00  |
    And the following merchants exist:
      | id | account id |
      | X  | B          |
    When the client pays PHP 50.00 using account "C" to merchant "X"
    Then the payment is unprocessable
    And the payment error code is "SourceAccountNotFoundException"
    And the payment error message is "Source account not found"

  Scenario: Failed Payment due to Non-existent Merchant
    Given the following accounts exist:
      | id | currency | balance |
      | A  | PHP      | 500.00  |
      | B  | PHP      | 250.00  |
    And the following merchants exist:
      | id | account id |
      | X  | B          |
    When the client pays PHP 50.00 using account "A" to merchant "Y"
    Then the payment is unprocessable
    And the payment error code is "MerchantNotFoundException"
    And the payment error message is "Merchant not found"
Example acceptance tests for a simple payment feature, written in Gherkin
Because high-level black box integration tests are acceptance tests, usage of acceptance test frameworks like Cucumber, which uses Gherkin, or Gauge, which uses Markdown, is recommended.
These frameworks, and the way they structure test writing, help detach tests from lower-level technical details and allow even non-engineers to get a good view of the acceptance requirements and scenarios. However, this is just a strong recommendation and not mandated by the strategy.
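For illustration, a step from the feature above could bind to plain Java glue code like the following; the endpoint, payload shape, and status code are assumptions for the sketch, not the repository's actual step definitions.

import java.math.BigDecimal;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

import io.cucumber.java.en.Then;
import io.cucumber.java.en.When;

import static org.assertj.core.api.Assertions.assertThat;

// Hypothetical Cucumber glue: each Gherkin step drives the running system component
// through its public HTTP interface and keeps the last response for later assertions.
public class PaymentSteps {

    private final HttpClient client = HttpClient.newHttpClient();
    private HttpResponse<String> lastResponse;

    @When("the client pays {word} {bigdecimal} using account {string} to merchant {string}")
    public void theClientPays(String currency, BigDecimal amount, String accountId, String merchantId) throws Exception {
        // in a real suite, the logical ids from the Given steps would be resolved to generated ids
        var body = """
                {"sourceId": "%s", "merchantId": "%s", "amount": %s, "currency": "%s"}
                """.formatted(accountId, merchantId, amount, currency);
        var request = HttpRequest.newBuilder(URI.create("http://localhost:8080/payments"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        lastResponse = client.send(request, HttpResponse.BodyHandlers.ofString());
    }

    @Then("the payment is accepted")
    public void thePaymentIsAccepted() {
        assertThat(lastResponse.statusCode()).isEqualTo(200); // assumed success status
    }
}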
High-Level Tests in Summary
High-level black box integration tests focus on business and technical acceptance. More specifically, they should:
- be separated into multiple test suites that focus on different integration levels of the system (system component, end-to-end, etc.)
- test real running applications that collaborate with emulators and stubs
- focus on business and technical acceptance coverage and not on line and branch coverage
What's Next?
Even when prioritizing the implementation of this test strategy, we will always have to refer back to the test pyramid and its recommendations.
Because low-level tests are the foundation of the strategy, making sure they are numerous and well crafted is of utmost importance. Only then should we start concerning ourselves with tests that focus on more integrated views of the system. This means we have to prioritize in this order:
- unit tests
- white box integration tests (application, infrastructure)
- system component black box integration (a single component)
- end-to-end
- end-to-end with UI
As familiarity and maturity increase, we can focus on strengthening the test strategy, possibly by introducing more test suites that focus either on non-functional requirements (performance, load testing) or on other levels of integration (system slices).
Summary
The test pyramid establishes the relationship between the component integration spectrum and cost (development and maintenance). Because of this, it provides an excellent guide for our test strategy.
In detail, the strategy dissects a system into its varying levels of integration and suggests corresponding tests for each. We have defined two main groups of tests: low-level tests that focus on line and branch coverage, and high-level tests that focus on requirements and acceptance coverage. In more specific detail, these are the tests previously discussed.
Test | Type | Object under Test | Test Structure |
---|---|---|---|
Unit | Low-Level | core domain classes and functions; utilities, strategies, and similar structures in the code | tests code directly; mocks collaborators; validates against returned value and mock state |
Infrastructure White Box Integration | Low-Level | infrastructure adapters | tests code directly; uses emulated environment component or embedded alternative; validates against returned value and environment state |
Application White Box Integration | Low-Level | application adapters | tests through the channel (HTTP, gRPC, topic, queue, etc.); mocks core domain use cases; uses channel infrastructure (Kafka, etc.); validates against returned response from the channel and mock state |
System Component Black Box Integration | High-Level | a single system component | tests via system component interfaces; spins up emulated environment components and stubs collaborators; validates against returned response from the channel, environment state, and application state |
Slice Black Box Integration | High-Level | a methodical slice of the system | tests via system component interfaces; spins up emulated environment components and stubs collaborators around the slice boundaries; validates against returned response from the channel, environment state, and application state |
End-to-End Black Box Integration | High-Level | the back-end system | tests via system component interfaces; spins up emulated environment components and stubs for 3rd party collaborators; validates against returned response from the channel, environment state, and application state |
End-to-End with UI Black Box Integration | High-Level | the system with UI | tests via the user interface; spins up emulated environment components and stubs for 3rd party collaborators; validates against user interface state, environment state, and application state |
This strategy should be a good enough jumping-off point for testing a large distributed system more structurally and strategically at its different levels of integration. As a team's automation practice matures, the team may further refine this strategy by incorporating non-functional test suites or introducing test suites that focus on different slices of the system to better align with user needs.
Hopefully, this long discussion of test strategies gives a more structured understanding of testing and lessens the tendency to haphazardly test for testing's sake, especially in the face of dragons.
"Fairy tales are more than true: not because they tell us that dragons exist, but because they tell us that dragons can be beaten."
– Neil Gaiman, Coraline
Thanks to Nyker Matthew King for having a second read on this write-up. If anyone has questions, please feel free to reach out to me via [email protected].