Why I Insist on Hand-written Mocks

Image by: Teresia Tarlinder

In a developer testing training, I turn the participants’ first encounter with mock objects and stubs into an exercise in writing test doubles by hand. Then, if time permits, we repeat the exercise using a mocking framework. In this post, I’ll explain why.

Terminology plays a central role in developer testing. By being precise in our wording and labeling, we become accurate in selecting a technique. Think of it like this:

  • Dragon – Bring a shield and sword
  • Vampire – Bring garlic, holy water, and a wooden stake
  • Werewolf – Bring silver projectiles

If we know exactly what something is, the chances are greater that we bring the right gear. Having browsed through numerous codebases, I can say that few concepts are as abused as the “mock”. Here are some of the interpretations that I can recall having seen throughout the years:

  • Stub
  • Organizer class/entity
  • System that isn’t quite ready for production
  • Database with test data
  • Stub or fake that replaces a system or component

To explain the purpose of the mock object, I get the participants to implement it roughly like this:

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

public class HandwrittenMock implements CollaboratorOperations {

    private boolean wasCalled;
    private int parameterValue;

    @Override
    public void doSomething(int parameter, String dummy) {
        wasCalled = true;
        if (parameter == 42) {
            parameterValue = parameter;
        }
    }

    public void verify() {
        assertTrue("doSomething() must be called", wasCalled);
        assertEquals("parameter must match expected value", 42, parameterValue);
    }
}
Endless variations are possible, of course, but the central point remains clear: a mock object can fail a test, whereas a stub can’t. Yes, the assertions in the class are clunky, but they get the message across. Apart from explaining what a mock object is actually supposed to do, this approach demonstrates that there’s no magic involved (although the things some frameworks do to get mock objects working may seem like magic).
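To make the contrast concrete, here’s a sketch of a hand-written stub of the same collaborator. The getAnswer() query method is a hypothetical addition for illustration, since a stub’s job is to supply indirect input rather than to verify interactions:

```java
// A hand-written stub: it supplies canned indirect input and contains
// no assertions, so unlike the mock it can never fail a test.
// The interface shape and its getAnswer() method are assumptions
// made for this sketch.
interface CollaboratorOperations {
    void doSomething(int parameter, String dummy);
    int getAnswer();
}

class HandwrittenStub implements CollaboratorOperations {

    @Override
    public void doSomething(int parameter, String dummy) {
        // Deliberately empty: a stub neither records nor verifies calls.
    }

    @Override
    public int getAnswer() {
        return 42; // canned value: the indirect input to the unit under test
    }
}

public class Main {
    public static void main(String[] args) {
        CollaboratorOperations stub = new HandwrittenStub();
        stub.doSomething(0, "ignored"); // no effect, nothing to verify
        System.out.println(stub.getAnswer());
    }
}
```

No matter what the test does with this object, it will never throw an assertion error on its own — which is exactly the property that separates it from the mock above.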

Still, why go through all this hassle to defend a word? There are two reasons. The first is general clarity and simplicity. You’ll be reading your tests differently knowing that something called “stub” will only provide indirect input, and something called “mock” will be performing verifications to check that a certain interaction has happened in a certain manner.

And there are solutions like this, of course:

Person person = mock(Person.class);

…which you don’t want in your code. (Note, by the way, that this is one of those cases where the framework makes you call a stub a mock.)

The second, more important, reason is that we want to stay out of the domain of interaction testing as much as possible. While we certainly can design code and tests well, it remains a fact that verifications require knowledge of how two program elements interact with each other, and they encode that knowledge in the test. This tends to make tests brittle.
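As a sketch of that brittleness (the ReportService and Publisher names are invented for this example): the verification below hard-codes the exact method and argument that report() is expected to use, so renaming the call or batching the messages breaks the test even though observable behavior stays the same.

```java
interface Publisher {
    void publish(String message);
}

// The unit under test.
class ReportService {
    private final Publisher publisher;

    ReportService(Publisher publisher) {
        this.publisher = publisher;
    }

    void report(int value) {
        publisher.publish("value=" + value);
    }
}

// A hand-written mock whose verification encodes exactly how
// ReportService talks to its collaborator.
class PublisherMock implements Publisher {
    private String lastMessage;

    @Override
    public void publish(String message) {
        lastMessage = message;
    }

    void verifyPublished(String expected) {
        if (!expected.equals(lastMessage)) {
            throw new AssertionError(
                "expected " + expected + " but was " + lastMessage);
        }
    }
}

public class Main {
    public static void main(String[] args) {
        PublisherMock mock = new PublisherMock();
        new ReportService(mock).report(7);
        // This line "knows" that report(7) publishes "value=7".
        mock.verifyPublished("value=7");
        System.out.println("interaction verified");
    }
}
```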

In summary, by having my trainees implement mocks and stubs “by hand” once, I hope to:

  • Emphasize the distinction between a stub and a mock (and a spy for advanced participants)
  • Show that mocks are not magic
  • Have people write clean tests where the purpose of the test double is clear
  • Reduce the number of tests that revolve around unnecessary interactions
  • Avoid clearly bad constructs (like in that Person example)

Let’s Not Turn the Test Automation Pyramid Upside Down Just Yet

A few days ago, I listened to Gojko Adzic’s talk “Humans vs Computers: five key challenges for software quality tomorrow” at Jfokus. It was a great talk and it really gave some food for thought. This summary will not do it justice, but basically the plot was that our software is now being used by other software, and there’s AI, and voice recognition, and a mix of all this will (and already does) cause new kinds of trouble. Not only must we be prepared for the fact that the “user” is no longer human; we must also take into account new edge cases, such as twins for facial and voice recognition, and the fact that our software may stop working because someone else’s does. All in all, the risk rightfully shifts towards integration, and to handle it, we need to turn to monitoring for unexpected behavior. This made Mr. Adzic propose that we do something about the test automation pyramid. Turn it upside down, maybe?

Personally, I vote for the test automation monolith :), or rectangle. I’ll tell you why. First, I have to admit that this talk made some pieces fall into place for me. My ambition with regard to developer testing is to raise the bar in the industry. I don’t want us to wonder about how many unit tests we need to write or how we should name them. Mocks and stubs should be used appropriately, and testability should be in every developer’s autonomic nervous system. But why? And here’s the eye-opener: because we’ll need to be solving harder problems in a few years (if not already today). Instead of, or more likely in addition to, finding simple boundary values to avoid off-by-one errors, we’ll also need to handle the twins using voice authorization to log in to our software. Needless to say, we shouldn’t spend too much of our mental juice on writing simple unit tests and the like.

That being said, we can’t abandon the bottom layer of the pyramid. Imagine handling strange AI-induced edge cases in a codebase that hasn’t been properly crafted for testability and tested to at least some degree. It would probably be the equivalent of adding unit tests to poorly designed code, or even worse. Yes, monitoring will probably play a greater part in the software of tomorrow, but isn’t it just another facet of observability?

So, what will probably happen next is that the top of the testing pyramid will grow thicker, maybe like this (couldn’t resist the “AI”):

Test Automation Monolith