Tests are an indicator of code quality

Intro

Whenever you want to assess the quality of a codebase, a good place to start is its test suite.

Each project falls under one of the following cases:

  1. There are not any tests
  2. There are some tests
  3. There are a lot of tests

If it’s the first case, there is not much to be said. Every change we make is a potential threat to the application’s stability, as it can result in breakage. In that case, our only option is to start adding tests.

We will focus on the other two cases. We will see why tests are important, why many tests do not necessarily indicate a well-tested system, and what we should pay attention to when we write tests.

Before we start, I need to define what I personally consider the purpose of tests and what makes them important. I consider their purpose to be the following:

  1. Verifying that the implementation works
  2. Setting up guard rails to alert us when something breaks
  3. Making sure that our code is maintainable by allowing us to test it thoroughly

Quantity

We will start by looking into the quantity property of a test suite.

Judging a codebase by the number of tests it contains can easily create the illusion that the codebase is well tested. More often than not, this is false.

Let’s assume that we open a project, run gradle test, and Gradle reports that it executed 100 tests and all of them passed. At first sight, we can assume that many parts of the code have been tested. What happens, though, if we open the tests and see the following?

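class Person(val name: String)

// A trivial test: it only checks that a property returns what the constructor stored.
@Test
fun testGetName() {
    // assertEquals takes the expected value first, then the actual one.
    assertEquals("Joe", Person("Joe").name)
}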

We can see that, despite the number of tests, they do not provide any value because the parts being tested are trivial. Unfortunately, this is pretty common in business applications, where mandates to increase coverage create the illusion of a well-tested codebase.

Another example of this is when we have a function that composes multiple functions and we write tests for the inner functions instead of testing the parent function itself. In short, we test the implementation instead of the behaviour, a topic we will come back to in the Mocking section.

fun calculatePrice(cart: Cart) = calculateShipping(cart) + calculateTax(cart) - calculateDiscount(cart)

In this case, we need to calculate the total price. Instead of writing 4 tests:

  1. testCalculateShipping
  2. testCalculateTax
  3. testCalculateDiscount
  4. testCalculatePrice

We only need to write one test, testCalculatePrice. The other three functions are tested implicitly, and writing a test for each of them individually does not provide any value. It only increases the amount of code we need to maintain, as tests are also code that needs maintenance.

If we need to show the tax or the discount on its own, for example, the individual tests are not only useful but also required.
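As a minimal sketch of testing the behaviour through the parent function, the snippet below assumes a hypothetical Cart with a subtotal and a weight, and made-up rules for shipping, tax and discount. None of these come from a real codebase; only the shape of calculatePrice follows the example above.

```kotlin
import kotlin.test.Test
import kotlin.test.assertEquals

// Hypothetical cart; amounts are in cents to avoid floating point rounding.
data class Cart(val subtotal: Long, val weightKg: Int)

// Made-up business rules, for illustration only.
fun calculateShipping(cart: Cart): Long = cart.weightKg * 500L
fun calculateTax(cart: Cart): Long = cart.subtotal * 20 / 100
fun calculateDiscount(cart: Cart): Long = if (cart.subtotal >= 10_000L) 1_000L else 0L

fun calculatePrice(cart: Cart): Long =
    calculateShipping(cart) + calculateTax(cart) - calculateDiscount(cart)

class CalculatePriceTest {
    @Test
    fun testCalculatePrice() {
        val cart = Cart(subtotal = 10_000, weightKg = 2)
        // shipping 1000 + tax 2000 - discount 1000
        assertEquals(2_000L, calculatePrice(cart))
    }
}
```

A single assertion on calculatePrice fails if any of the three inner rules changes its result, which is why the inner functions do not need tests of their own here.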

Slowness

Another indicator of a poorly tested codebase and thus lower quality code is when the tests are slow. This is not only a problem for the quality of the tests but also for the productivity of the developers.

Slow tests carry the risk that developers stop writing tests in order to avoid making the suite even slower.

Slowness can be caused by multiple factors. Some of them are:

  1. A lot of dependencies are needed to test a specific part of the code
  2. The code under test is not isolated, which forces us to exercise multiple unnecessary parts as well

As an example, let’s look again at a cost calculation method:

class Calculator(private val database: Database) {
    fun calculateCost(destinationCountry: DestinationCountry): Long {
        val products = database.fetchProducts()
        val discount = calculateDiscount(products)
        val shipping = calculateShipping(products, destinationCountry)
        val tax = calculateTax(products)
        val price = calculatePrice(products)

        return (price + tax + shipping) - discount
    }

    private fun calculateTax(products: List<Product>): Long {
      // ...
    }

    private fun calculateDiscount(products: List<Product>): Long {
      // ...
    }

    // calculateShipping and calculatePrice are omitted for brevity
}

It is very common for developers to use the @SpringBootTest annotation and write the following test:

@SpringBootTest
class CalculatorTest {
    @Autowired
    lateinit var calculator: Calculator

    @Test
    fun testCalculateCost() {
        // calculator.calculateCost(...)
    }
}

You can imagine how time consuming and unnecessary that can be, since we only need the Calculator class and some more infrastructure code to set up the Database.

Instead of using the context, we can manually construct the class we need and reduce the execution time of the test.

An example of this would be:

class CalculatorTest {
    private val calculator: Calculator = Calculator(setupDatabase())

    @Test
    fun testCalculateCost() {
        // calculator.calculateCost(...)
    }
}
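The setupDatabase helper is not shown above. One possible shape for it, assuming that Database is an interface with a single fetchProducts method and that Product holds a name and a price (both are assumptions made only for this sketch), is an in-memory implementation:

```kotlin
// Hypothetical types: the real Product and Database are not shown in this article.
data class Product(val name: String, val price: Long)

interface Database {
    fun fetchProducts(): List<Product>
}

// In-memory stand-in that needs neither a Spring context nor a running database.
class InMemoryDatabase(private val products: List<Product>) : Database {
    override fun fetchProducts(): List<Product> = products
}

// Builds the database the test needs with a small, fixed data set.
fun setupDatabase(): Database = InMemoryDatabase(
    listOf(Product("book", 1_500), Product("pen", 200))
)
```

If Database is a concrete class backed by a real connection instead, the helper would need to start that connection, which is still cheaper than booting the whole application context.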

Continuing with the example above, if we want to test the calculateTax method, we cannot, as it is private. The only way to do so is through the calculateCost method.

While it contradicts what was mentioned in the previous section, we can see that depending on our goal we need to follow a different approach and there is not a single rule for every case.

Assuming that in this case we need to be able to verify that calculateTax works correctly by using multiple scenarios, we are forced to write multiple scenarios for the calculateCost function, which requires setting up the database.

Instead, we can make the function a top level function and write unit tests for it.

// calculators.kt
fun calculateTax(products: List<Product>): Long {
  // ...
}

and test it using:

class CalculatorTest {
    @Test
    fun calculateTaxScenario1() {
        val products = scenario1Products()
        // assertEquals(.., calculateTax(products))
    }

    @Test
    fun calculateTaxScenario2() {
        val products = scenario2Products()
        // assertEquals(.., calculateTax(products))
    }

    @Test
    fun calculateTaxScenario3() {
        val products = scenario3Products()
        // assertEquals(.., calculateTax(products))
    }
}

This allows us to get rid of the database requirement and write unit tests that are much faster and allow more flexibility than integration tests.
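To make the scenarios concrete, here is a small sketch under an invented rule: a flat 20% tax on every product that is not marked as exempt. The Product fields and the rule itself are assumptions for illustration, not the real implementation.

```kotlin
import kotlin.test.Test
import kotlin.test.assertEquals

// Hypothetical product; prices are in cents.
data class Product(val name: String, val price: Long, val taxExempt: Boolean = false)

// Assumed rule: 20% tax on every non-exempt product.
fun calculateTax(products: List<Product>): Long =
    products.filterNot { it.taxExempt }.sumOf { it.price * 20 / 100 }

class CalculateTaxTest {
    @Test
    fun emptyListHasNoTax() {
        assertEquals(0L, calculateTax(emptyList()))
    }

    @Test
    fun taxableProductsAreTaxedAtTwentyPercent() {
        val products = listOf(Product("book", 1_000), Product("pen", 500))
        assertEquals(300L, calculateTax(products))
    }

    @Test
    fun exemptProductsAreSkipped() {
        val products = listOf(Product("book", 1_000), Product("bread", 500, taxExempt = true))
        assertEquals(200L, calculateTax(products))
    }
}
```

Each scenario is built from plain values, so adding a fourth or a fifth one costs a few lines and no infrastructure.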

Effort

A factor related to slowness is the effort needed to add a new test to an existing test suite.

Think about the following case:

We have discovered that our implementation has a bug. We adjust the code, and now we need to verify that it will continue to work by adding a new test. How easy is it to do so?

If we need to bring in multiple external dependencies (e.g. database, queues, caches) all at the same time, it means that our implementation has more responsibilities than it is supposed to have. This is a good moment to take a step back and refactor the code.

Doing so allows us to have a more maintainable codebase, as we can easily spot parts that are complex even before writing a test for them.

Hard-to-set-up test cases are almost always an indicator that our code is complex.
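One way such a refactoring could look is to pull the decision logic out into a pure function and leave only the wiring in the class that talks to the outside world. The names below (OrderRepository, EventQueue, loyaltyDiscount) and the discount rule are hypothetical and exist only for this sketch.

```kotlin
// Hypothetical order and collaborators, invented for this sketch.
data class Order(val id: String, val total: Long, val loyaltyYears: Int)

interface OrderRepository {
    fun find(id: String): Order
}

interface EventQueue {
    fun publish(event: String)
}

// Pure decision logic: testable with plain values, no external dependencies.
fun loyaltyDiscount(order: Order): Long =
    if (order.loyaltyYears >= 2) order.total * 10 / 100 else 0L

// Thin shell that only wires the dependencies together.
class DiscountService(
    private val orders: OrderRepository,
    private val queue: EventQueue,
) {
    fun applyDiscount(orderId: String): Long {
        val discount = loyaltyDiscount(orders.find(orderId))
        queue.publish("discount-applied:$orderId:$discount")
        return discount
    }
}
```

A regression test for a bug in the discount rule now only needs an Order value, for example loyaltyDiscount(Order("o-1", 10_000, 3)), instead of a repository and a queue.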

Mocking

In addition to the previous topics, one very important aspect, if not the most important, is the fragility of the tests. Fragility in this case means how often the tests break after a code change.

It is tiresome to maintain fragile tests that constantly go red, even after changing parts that are unrelated.

Most of the time, tests are fragile because of mocking, so mocking should be used as a last resort.

The reason is that mocks push us into writing tests that do not verify the behaviour of a piece of code, but its internal implementation. As soon as we refactor an internal part that is not exposed through the function’s signature, the tests we wrote using mocks will break, even if the function’s behaviour did not change for its callers.

Let’s look at the following example:

class Calculator(private val foreignExchange: ForeignExchangeClient) {
    // An explicit return type is required because the body uses return.
    fun calculateTotal(cart: Cart): Double {
        val total = cart.products.sumOf { it.price }
        return total * foreignExchange.getRate(cart.currency)
    }
}

We have a function which, given a cart, calculates its total price and converts it using the exchange rate for the cart’s currency.

We then proceed to write a test for it:

class CalculatorTest {
    @Test
    fun testCalculateTotal() {
        val foreignExchange: ForeignExchangeClient = mockk()
        val calculator = Calculator(foreignExchange)
        val cart = setupCart()
        // getRate is not a suspend function, so every is enough (coEvery is meant for suspend calls)
        every { foreignExchange.getRate(cart.currency) } returns 1.2
        assertEquals(.., calculator.calculateTotal(cart))
    }
}

We mock getRate in order to be able to return the rate. So far so good. Then we get a new requirement which states that:

calculateTotal should not apply any foreign exchange if the currency is euro

Our test will now break because our mock is no longer sufficient.

The solution for this is to use fakes instead of mocks.

An example of this would be:

class CalculatorTest {
    @Test
    fun testCalculateTotal() {
        val foreignExchange = object : ForeignExchangeClient {
            override fun getRate(currency: String): Double {
                // implementation
            }
        }
        val calculator = Calculator(foreignExchange)
        val cart = setupCart()

        assertEquals(.., calculator.calculateTotal(cart))
    }
}

We can add a new test that covers that case without breaking the previous tests.

class CalculatorTest {
    @Test
    fun testCalculateTotalWithEuroCurrency() {
        val foreignExchange = object : ForeignExchangeClient {
            override fun getRate(currency: String): Double = if (currency == "EUR") 1.0 else {
                // ...
            }
        }
        val calculator = Calculator(foreignExchange)
        val cart = setupCart()

        assertEquals(.., calculator.calculateTotal(cart))
    }
}
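Instead of declaring an anonymous object in every test, the fake can also be a small reusable class. The sketch below assumes that ForeignExchangeClient is an interface with a single getRate method, that prices are plain Double values, and that the euro requirement is implemented by skipping the conversion. All three are assumptions made for illustration.

```kotlin
import kotlin.test.Test
import kotlin.test.assertEquals

// Hypothetical domain types, simplified for the example.
data class Product(val price: Double)
data class Cart(val products: List<Product>, val currency: String)

interface ForeignExchangeClient {
    fun getRate(currency: String): Double
}

// Fake: a real, working implementation backed by a fixed table of rates.
class FakeForeignExchangeClient(private val rates: Map<String, Double>) : ForeignExchangeClient {
    override fun getRate(currency: String): Double =
        rates[currency] ?: error("No rate configured for $currency")
}

class Calculator(private val foreignExchange: ForeignExchangeClient) {
    fun calculateTotal(cart: Cart): Double {
        val total = cart.products.sumOf { it.price }
        // Assumed implementation of the new requirement: euro carts are not converted.
        return if (cart.currency == "EUR") total else total * foreignExchange.getRate(cart.currency)
    }
}

class CalculatorTest {
    private val calculator = Calculator(FakeForeignExchangeClient(mapOf("USD" to 2.0)))

    @Test
    fun convertsNonEuroCarts() {
        val cart = Cart(listOf(Product(10.0), Product(5.0)), "USD")
        assertEquals(30.0, calculator.calculateTotal(cart))
    }

    @Test
    fun leavesEuroCartsUnconverted() {
        val cart = Cart(listOf(Product(10.0), Product(5.0)), "EUR")
        assertEquals(15.0, calculator.calculateTotal(cart))
    }
}
```

Both tests assert only on the returned total. The fake has no rate for EUR, so the second test would fail loudly if the calculator still asked for one, without any explicit verification of calls.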

An additional example that highlights the benefit of fakes is Ktor’s ability to use fake responses when we want to work with external APIs that we cannot call in our tests.

Ktor Client allows us to fake an HTTP response using its MockEngine.

Using this, we can return multiple responses, specific headers, status codes and so on, which gives us a test that is very close to the real implementation.

The benefit of this over mocking is that instead of mocking entire implementations, we have a test that uses the real implementations and fakes only the HTTP call.

This way we can test the parsing of JSON bodies, cookies, request parameters, the handling of HTTP status codes and headers, and so on.
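As a rough sketch of what this can look like, the snippet below assumes Ktor 2.x with the ktor-client-mock and kotlinx-coroutines dependencies on the classpath. The URL, the path and the JSON payload are made up for the example.

```kotlin
import io.ktor.client.HttpClient
import io.ktor.client.engine.mock.MockEngine
import io.ktor.client.engine.mock.respond
import io.ktor.client.request.get
import io.ktor.client.statement.bodyAsText
import io.ktor.http.HttpHeaders
import io.ktor.http.HttpStatusCode
import io.ktor.http.headersOf
import kotlinx.coroutines.runBlocking

// Fake engine: answers one hypothetical path with JSON and everything else with 404.
fun fakeExchangeEngine(): MockEngine = MockEngine { request ->
    if (request.url.encodedPath == "/rates/USD") {
        respond(
            content = """{"currency":"USD","rate":1.2}""",
            status = HttpStatusCode.OK,
            headers = headersOf(HttpHeaders.ContentType, "application/json"),
        )
    } else {
        respond(content = "", status = HttpStatusCode.NotFound)
    }
}

// Code under test: uses a regular HttpClient and does not know the engine is fake.
fun fetchRateBody(client: HttpClient): String = runBlocking {
    client.get("https://exchange.example/rates/USD").bodyAsText()
}

fun main() {
    val client = HttpClient(fakeExchangeEngine())
    println(fetchRateBody(client))
}
```

Only the engine is swapped out. Request building, header handling and response reading all go through the same client code that runs in production.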