Testing Trapezoid


Many of us are familiar with the testing pyramid: a picture of all tests arranged by their relative
number. For example, the Google Testing Blog post
Just Say No to More End-to-End Tests
shows the following pyramid:

Testing pyramid

  • small tests: The base of the pyramid is formed by the small unit tests.
    There are a lot of them; they are simple to write and quick to run.
    The unit tests exercise individual pieces of code in isolation, usually trying to cover all
    code paths and edge cases.
  • medium tests: The middle layer in the pyramid is occupied by tests that try to put a few
    parts together. These tests integrate a piece of code into a larger system. Some parts of
    the system are mocked; for example, we might test a logging feature while mocking the database.
  • large tests: At the top of the pyramid lie end-to-end tests (also called feature tests).
    They are hard to write, take a while to run, and can only cover some execution paths.
    I also believe e2e tests are hard to refactor; they slow down feature development,
    because the tests need to be modified whenever a feature changes.
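
To make the medium layer concrete, here is a minimal sketch of the "logging feature with a mocked database" case. The names `Database`, `AuditLogger` and `MockDatabase` are hypothetical and invented for this illustration; they do not come from any real project or library.

```typescript
// Hypothetical interface: the only thing the logger needs from a database.
interface Database {
  insert(table: string, row: Record<string, string>): void;
}

// Hypothetical feature under test: writes log entries to the database.
class AuditLogger {
  private readonly db: Database;

  constructor(db: Database) {
    this.db = db;
  }

  log(user: string, action: string): void {
    if (user.trim() === "") {
      throw new Error("user is required");
    }
    this.db.insert("audit_log", { user, action });
  }
}

// Hand-written mock: records calls instead of talking to a real database.
class MockDatabase implements Database {
  readonly calls: Array<{ table: string; row: Record<string, string> }> = [];

  insert(table: string, row: Record<string, string>): void {
    this.calls.push({ table, row });
  }
}

// The medium test: a real logger wired to the mocked database.
function testLoggerWritesToDatabase(): boolean {
  const db = new MockDatabase();
  const logger = new AuditLogger(db);
  logger.log("alice", "login");

  const call = db.calls[0];
  return (
    db.calls.length === 1 &&
    call !== undefined &&
    call.table === "audit_log" &&
    call.row.user === "alice" &&
    call.row.action === "login"
  );
}
```

The test exercises the logger together with its calling convention for the database, but it says nothing about whether a real database would accept the row. That gap is what the large tests are meant to cover.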

The large tests are so hard to write that even the Google blog post above argues that very few
of them should ever be written! To me this seems to go against common sense: the end-to-end
tests exercise your system just like a user would. Fewer runs through the system
almost always means that certain bugs will slip in and affect the actual users.

On the other hand, do the unit tests effectively catch the edge cases in the production system?
Maybe. Maybe not. I would argue that the edge cases that the real system sees have more to do
with error handling in the total system, and how it responds to invalid data, rather than how
a particular small piece of code acts.

This leads to my personal opinion that the user would benefit from more end-to-end tests
and fewer unit tests (because unit tests take up development time too).

The common testing pyramid has a wide base and a “pointy” top. What if we could change the
shape of the pyramid to better align the amount of testing with the user’s goals?

What can we do to change the shape of the pyramid?

Well, we have done several things to change it.

  1. Replaced some unit tests with static type checks by going with TypeScript. There is less need
    to test what happens to a + b when a and b are strings if the compiler already guarantees
    that the arguments are always numbers.
  2. Set up and use crash reporting services, like Sentry / Raygun / etc. The instant crash
    reporting in production benefits the users directly. Everyone is expecting bugs and crashes,
    but many users are delighted when we find out about the problems instantly and deploy
    patches quickly.
  3. Experimented with lots of e2e testing tools for both API and browser testing.
    Luckily, there are interesting new tools for browser testing that prioritize developer
    experience, like Cypress, which lets us write e2e tests as easily as unit tests.
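
As a small illustration of the first item, here is a sketch of how a type annotation can stand in for a unit test. The function `add` is a hypothetical example, not code from our project.

```typescript
// Hypothetical function: the compiler guarantees both arguments are numbers.
function add(a: number, b: number): number {
  return a + b;
}

const total = add(2, 3);

// The call below is rejected by the TypeScript compiler with a type error,
// so there is no need for a unit test covering "what if a and b are strings":
// add("2", "3");
```

In plain JavaScript the commented-out call would run and concatenate the strings, which is exactly the case a unit test would have had to catch.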

With these changes, here is our testing trapezoid:

Testing trapezoid

Notice that we also squeezed the middle integration test layer to be pretty thin – with quick
and cheap immutable deploys we can just go with all e2e tests instead.

I will be happy to discuss our testing approach and philosophy; just drop me an email
or tweet at @bahmutov.
