Automated Functional Tests

Modern software is complex. As we increase the amount of code in our product, we lose the ability to hold a complete model of our architecture in our heads. Thus, every change we integrate has the potential to affect other code paths unintentionally. To minimize the risk of delivering error-prone software, we run automated functional tests.

Across our industry, we share common terminology for categories of functional tests, but we do not share common definitions for those terms. The most popular terms include unit tests, integration tests, and end-to-end tests. However, if we ask any number of engineers to define a "unit", we can expect at least as many interpretations.

The book Software Engineering at Google by Titus Winters, Tom Manshreck, and Hyrum Wright maps these terms to strictly defined scopes and categorizes tests as narrow-scoped, medium-scoped, and large-scoped. Setting aside the specificity of the term narrow-scoped, I use small-scoped instead in this book for the sake of simplicity and common use.

Recommendation

Engineering Collaboration
Based on their experience at Google, software engineers Titus Winters and Hyrum Wright, along with technical writer Tom Manshreck, present a candid and insightful look at how some of the world's leading practitioners construct and maintain software. Some software testing concepts in Engineering Collaboration originate from these insights.

Software Engineering at Google was (and is) the book that inspired me to write Engineering Collaboration. I highly recommend you visit abseil.io or purchase the book from O'Reilly.

The scope of these tests refers to the difficulty of writing the tests, the complexity of assembling the test dependencies, and the execution time of the tests. The larger the scope, the more resources we invest in writing and running that category of tests. We run as many of our tests as early and as frequently in the development cycle as we can without stalling developer velocity.

When testing on dedicated machinery on-prem, every minute our build and test infrastructure sits idle is a minute without feedback on potential regressions. However, every processing minute on third-party cloud runners increases our costs. Depending on our needs, we either buy more minutes, offload some tests to on-prem runners, or reduce the scope of tests and run larger tests less frequently.

To highlight the appropriate distribution of scoped test sizes within our software, Mike Cohn provides a metaphorical representation in his book Succeeding with Agile. The three-layered testing pyramid counsels us on the relative amount of automated tests for each scope. As a rule of thumb, when ascending the pyramid, every layer of the pyramid reduces the number of tests by an order of magnitude. If we start at a thousand small-scoped tests, we aim for a hundred medium-scoped tests, and no more than ten large-scoped tests.
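The rule of thumb above can be worked through as a short calculation. The following is a minimal sketch, not a prescription: the function names `pyramid_distribution` and `shares` are hypothetical, and the factor of ten is simply the order of magnitude mentioned in the text.

```python
def pyramid_distribution(small_count, layers=3, factor=10):
    """Return test counts per layer, from the base to the pinnacle.

    Each layer holds `factor` times fewer tests than the layer below it.
    """
    return [max(1, small_count // factor**level) for level in range(layers)]


def shares(counts):
    """Return each layer's share of all tests, in percent."""
    total = sum(counts)
    return [round(100 * count / total, 1) for count in counts]


counts = pyramid_distribution(1000)
print(counts)          # [1000, 100, 10]
print(shares(counts))  # [90.1, 9.0, 0.9]
```

Under these assumptions, small-scoped tests account for roughly nine out of ten automated tests in the suite.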

Small-scoped tests form the base of the pyramid. Commonly referred to as unit tests, they ensure correct behavior within a system. Small-scoped tests make up about 75%-90% of the automated tests in our software, and we execute them frequently. It is not uncommon for developers to run the small-scoped suite of tests every couple of minutes while working on the source code. Generally, executing all of our small-scoped tests takes seconds.

The mid-layer consists of medium-scoped tests, or integration tests. When executing these tests we verify the behavior between embedded or connected systems. The process requires us to spawn these dependent systems in order to run the tests. In the section Medium-Scoped Tests we explore how to reduce the complexity of the procedure. We expect our medium-scoped test suite to complete in minutes.

The pinnacle of the pyramid consists of automated large-scoped tests. These follow an interaction flow across the entire product, from user input to system processing and back to the user response. Thus, these tests are commonly known as end-to-end tests. Large-scoped tests tend to be difficult to write, complex to set up and execute, and generally long-running. Automated large-scoped tests consume hours of our testing budget.

The testing pyramid promises us the most bang for our buck. If we are able to run the majority of our tests in seconds, we catch errors as quickly as possible. Any error reported by small-scoped tests takes seconds to uncover. An error that only surfaces in medium- or large-scoped tests can occupy us for hours, if not days, before we locate and fix it.

A common anti-pattern to this approach is the testing snow cone, or inverted testing pyramid. These projects contain few or no small-scoped tests, with all the testing done in medium-to-large-scoped tests, combined with labor- and time-intensive manual tests. Flipping the distribution of scopes results in long test runs and slow feedback cycles.

The testing snow cone typically emerges from legacy software that was not written with testability in mind. Closed-off systems make it difficult to test individual steps, and a lack of dependency injection prevents us from writing lightweight in-memory tests. Besides that, a lack of ownership in infrastructure or testing workflows may invert our testing pyramid. If we find it difficult to write, update, and execute tests, we lack the motivation to do so and shift the responsibility to QA.
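To illustrate how dependency injection enables lightweight in-memory tests, here is a minimal sketch. All names in it (`UserStore`, `InMemoryUserStore`, `WelcomeMailer`) are hypothetical and chosen for illustration only. The class under test receives its store from the caller, so a test can hand it a dictionary-backed stand-in instead of a real database.

```python
from typing import Dict, Optional, Protocol


class UserStore(Protocol):
    """Anything that can look up a user's email address."""

    def find_email(self, user_id: int) -> Optional[str]: ...


class InMemoryUserStore:
    """Test double that keeps users in a dict instead of a database."""

    def __init__(self, emails: Dict[int, str]) -> None:
        self._emails = emails

    def find_email(self, user_id: int) -> Optional[str]:
        return self._emails.get(user_id)


class WelcomeMailer:
    """Receives its store from the caller rather than constructing one itself."""

    def __init__(self, store: UserStore) -> None:
        self._store = store

    def greeting_for(self, user_id: int) -> str:
        email = self._store.find_email(user_id)
        if email is None:
            raise LookupError(f"unknown user {user_id}")
        return f"Welcome, {email}!"


# No database, disk, or network is involved in exercising the behavior.
mailer = WelcomeMailer(InMemoryUserStore({1: "ada@example.com"}))
print(mailer.greeting_for(1))  # Welcome, ada@example.com!
```

Had `WelcomeMailer` created its own database connection internally, the same behavior could only be verified by spawning that database, which pushes the test into a larger scope.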

Over the rest of this chapter we'll explore our testing scopes in detail.

Small-Scoped Tests

The advantage of small-scoped tests lies in their low resource footprint. The limitations we enforce and the isolated nature of small, focused tests make the code straightforward to write and maintain. Small-scoped tests execute swiftly and without initializing external dependencies. Our software commonly contains thousands of small-scoped tests, which collectively complete in seconds.

We limit small-scoped tests to the fastest testable entity in our code: in-memory operations. Small-scoped tests do not run disk operations or network operations. They do not sleep, make other blocking calls, or depend on other OS processes. These factors are covered by medium-scoped tests. Writing source code and its test code with these constraints in mind enables us to verify a broad set of behavior with minimal effort.
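As one illustration of these constraints, the sketch below tests time-dependent behavior without sleeping. It uses Python's standard `unittest` module; the `FakeClock` and `Session` classes are hypothetical examples, not part of any library. The test advances a fake clock in memory instead of waiting for real time to pass.

```python
import unittest


class FakeClock:
    """In-memory clock that the test advances by hand."""

    def __init__(self, now: float = 0.0) -> None:
        self.now = now

    def time(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


class Session:
    """Expires `ttl_seconds` after creation, according to the injected clock."""

    def __init__(self, clock, ttl_seconds: float) -> None:
        self._clock = clock
        self._expires_at = clock.time() + ttl_seconds

    def is_expired(self) -> bool:
        return self._clock.time() >= self._expires_at


class SessionTest(unittest.TestCase):
    def test_session_expires_after_ttl(self):
        clock = FakeClock()
        session = Session(clock, ttl_seconds=30)
        self.assertFalse(session.is_expired())
        clock.advance(30)  # no sleep: 30 seconds pass instantly
        self.assertTrue(session.is_expired())


# Run the test case programmatically and keep the result for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SessionTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In production code, the standard `time` module could be passed as the clock, since it also exposes a `time()` function.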

The limitations on their complexity make small-scoped tests fast. Hence, we run them as often as possible. Their small footprint lets us execute them during every step of the development cycle: while coding on our local machines, in pre-merge checks, in post-merge validations, and as pre-release requirements. A complete suite of passing small-scoped tests increases our confidence that we can introduce changes without destabilizing our code base.

Most mainstream programming languages ship with, or have well-established, tooling for writing and running small-scoped tests. Numerous blogs, tutorials, and guides cover setting up these tests for most languages and integrated development environments. Beyond the limitations highlighted above, we suggest following our programming language's most idiomatic practices.
