I _really_ have to dispute the idea that unit tests score the maximum on maintainability. The fact that they are _so_ tightly tied to lower-level code makes your code _miserable_ to maintain. Anyone who's ever had to work on a system that had copious unit tests deep within will know the pain of not just changing code to fix a bug, but having to change a half-dozen tests because your function interfaces have now changed and a healthy selection of your tests refuse to run anymore.
The "test diamond" has been what I've been working with for a long while now, and I find I greatly prefer it. A few E2E tests to ensure critical system functionality works, a whole whack of integration tests at the boundaries of your services/modules (which should have well-defined interfaces that are unlikely to change frequently when making fixes), and a handful of unit tests for things that are Very Important or just difficult or really slow to test at the integration level.
This helps keep your test suite size from running away on you (unit tests may be fast, but if you work somewhere that has a fetish for them, it can still take forever to run a few thousand), ensures you have good coverage, and helps reinforce good practices around planning and documentation of your system/module interfaces and boundaries.
> Anyone who's ever had to work on a system that had copious unit tests deep within will know the pain of not just changing code to fix a bug, but having to change a half-dozen tests because your function interfaces have now changed and a healthy selection of your tests refuse to run anymore.
In my experience this problem tends to be caused by heavily mocking things out more so than the unit tests themselves. Mocking things out can be a useful tool with its own set of downsides, but should not be treated as a requirement for unit tests. Tight coupling in your codebase can also cause this, but in that case I would say the unit tests are highlighting a problem and not themselves a problem.
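To make that concrete, here's roughly the contrast I have in mind (a toy sketch, every name here is invented):

```python
from dataclasses import dataclass
from unittest.mock import Mock

@dataclass
class Line:
    qty: int
    price: int

class OrderService:
    def __init__(self, repo):
        self.repo = repo

    def total(self, order_id):
        return sum(l.qty * l.price for l in self.repo.get(order_id))

# Brittle style: the test asserts on the *interaction* with the collaborator,
# so refactors that keep behaviour the same but change how the repo is used
# (caching, batching, a renamed method) break the test anyway.
def test_total_with_mock():
    repo = Mock()
    repo.get.return_value = [Line(2, 5), Line(1, 3)]
    assert OrderService(repo).total(42) == 13
    repo.get.assert_called_once_with(42)   # welded to the exact call shape

# Behaviour style: a tiny in-memory fake, and the test only checks the result.
def test_total_with_fake():
    class FakeRepo:
        def get(self, order_id):
            return [Line(2, 5), Line(1, 3)]

    assert OrderService(FakeRepo()).total(42) == 13
```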
Perhaps you're talking about some other aspect of unit tests? If that's the case then I'd love to hear more.
I'd also add to this that people often end up with very different ideas of what a unit test is, which confuses things further. I've seen people who write separate tests for each function in their codebase, with the idea that each function is technically a unit that needs to be tested, and that's a sure-fire way to run into tightly-coupled tests.
In my experience, the better approach is to step back and find the longer-living units that are going to remain consistent across the whole codebase. For example, I might have written a `File` class that itself uses a few different classes, methods, and functions in its implementation - a `Stats` class for the mtime, ctime, etc values; a `FileBuilder` class for choosing options when opening the file, etc. If all of that implementation is only used in the `File` class, then I can write my tests only at the `File` level and treat the rest kind of like implementation details.
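As a rough sketch of what I mean (the implementations here are invented, only the shape matters):

```python
import os
from dataclasses import dataclass

@dataclass
class Stats:                      # implementation detail of File
    mtime: float
    ctime: float

class FileBuilder:                # implementation detail of File
    def __init__(self, path):
        self.path = path

    def open(self):
        return File(self.path)

class File:                       # the unit the tests talk to
    def __init__(self, path):
        self.path = path

    @classmethod
    def builder(cls, path):
        return FileBuilder(path)

    def stats(self):
        st = os.stat(self.path)
        return Stats(mtime=st.st_mtime, ctime=st.st_ctime)

    def read_text(self):
        with open(self.path) as f:
            return f.read()

# Tests exercise only File's public surface; Stats and FileBuilder get
# covered indirectly and can be reshaped without touching any test.
def test_file_roundtrip(tmp_path):
    p = tmp_path / "hello.txt"
    p.write_text("hi")
    f = File.builder(str(p)).open()
    assert f.read_text() == "hi"
    assert f.stats().mtime > 0
```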
It may be that it's difficult to test these implementation details just from the `File` level - to me that's usually a sign that my abstraction isn't working very well and I need to fix it. Maybe the difficult-to-test part should actually be a dependency of the class that gets injected in, or maybe I've chosen the wrong abstraction level and I need to rearchitect things to expose the difficult-to-test part more cleanly. But the goal here isn't to create an architecture so that the tests are possible, the goal is to create an architecture that's well-modularised, and these systems are usually easier to test as well.
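Sticking with that made-up `File` example, pulling the awkward piece out as an injected dependency might look something like this:

```python
import hashlib

class Sha256Checksummer:
    """The slow/awkward piece, promoted to a dependency with its own tests."""
    def checksum(self, data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

class File:
    def __init__(self, path, checksummer=None):
        self.path = path
        self.checksummer = checksummer or Sha256Checksummer()

    def checksum(self):
        with open(self.path, "rb") as f:
            return self.checksummer.checksum(f.read())

class StubChecksummer:            # trivial stand-in for File's own tests
    def checksum(self, data):
        return "stub"

def test_file_delegates_checksumming(tmp_path):
    p = tmp_path / "x.bin"
    p.write_bytes(b"abc")
    assert File(str(p), checksummer=StubChecksummer()).checksum() == "stub"
```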
There's an argument that this isn't a unit test any more - it's an integration test, because it's testing that the different parts of the `File` class's implementation work together properly. My gut feeling is that the distinction between unit and integration is useless, and trying to decide whether this is one or the other is a pointless endeavour. I am testing a unit either way. Whether that unit calls other units internally should be an implementation detail to my tests. Hell, it's an implementation detail whether or not the unit connects to a real database or uses a real filesystem or whatever - as long as I can test the entirety of the unit in a self-contained way, I've got something that I can treat like a unit test.
At one of my previous jobs we did very few unit tests (basically only pure functions) and tons of behavior/integration tests (i.e. run the service with a real database, real queues, etc., but mock the HTTP dependencies, call its API, and check we get the correct result and side effects), and it was the most stable and easy-to-work-with test suite I've ever seen. It was extremely reliable too.
Not mocking the database and other pipes is the single best improvement everyone can make on their test suites.
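A scaled-down sketch of the pattern (sqlite standing in for the real database, and all the names invented; the real setup had Postgres, queues, and an HTTP API in front):

```python
import sqlite3

class FakePaymentsApi:                      # stands in for the HTTP dependency
    def __init__(self):
        self.charges = []

    def charge(self, user_id, amount):
        self.charges.append((user_id, amount))
        return {"status": "ok"}

class OrderService:
    def __init__(self, db, payments):
        self.db = db
        self.payments = payments
        self.db.execute("CREATE TABLE IF NOT EXISTS orders (user_id TEXT, amount INT)")

    def place_order(self, user_id, amount):
        if self.payments.charge(user_id, amount)["status"] != "ok":
            raise RuntimeError("payment failed")
        self.db.execute("INSERT INTO orders VALUES (?, ?)", (user_id, amount))
        self.db.commit()
        return {"ok": True}

def test_place_order_persists_and_charges():
    db = sqlite3.connect(":memory:")        # real SQL, the pipes aren't mocked
    payments = FakePaymentsApi()
    svc = OrderService(db, payments)

    assert svc.place_order("u1", 500) == {"ok": True}

    # check both the result and the side effects
    assert db.execute("SELECT user_id, amount FROM orders").fetchall() == [("u1", 500)]
    assert payments.charges == [("u1", 500)]
```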
We also had a test suite that followed the same principle but started all the services together to reduce the mocked surface, it executed every hour and was both incredibly useful and reliable too.
I wanted to drop in and say we had a version of this discussion internally while I was putting this post together. Your observation about fixing a bunch of tests for a simple one-line change is something I have seen as well. What we ultimately landed on is that, especially in our service-heavy environment (though not necessarily microservices), the cost of creating and maintaining integration testing infrastructure that is reliable, reasonably fast, and reflective of something prod-shaped turns out to be even higher. Specifically, we looked at things like the cost of creating parallel auth infra, realistic test data, and the larger, more complex test harness setups, and on balance it actually ends up being more expensive on a per-test basis. In fact, in some cases we see meaningful gaps in integration testing where teams have been scared off by the cost.
This isn't to say that unit tests, especially those with heavy mocking or other maintenance issues, don't carry their own costs; they absolutely do! But, and I think importantly, the cost per breakage is often lower, as the fix is much more likely to be localized to the test case or a single class, whereas problems in integration or E2E tests can start to approach debugging the prod system.
As with any "experiential opinion" like this, YMMV. I just set out to try to contribute something to the public discourse that's been reflective of our internal experience.
The testing-type divide feels similar to the schism around ORMs, where one camp (mine) finds that ORMs end up costing far more than the value they bring, while the other claims they've never had such issues and would never give up the productivity of their favorite ORM.
Both sides appear to be describing their experiences accurately, even though it feels like one side or the other should have to be definitively right.
I feel exactly the same about unit tests and ORMs. I'd like to know what kind of business applications these two things have really delivered. In my experience they both only worked in the most trivial sense: either the projects were mostly POCs and no one really cared about production issues, or they were so overstaffed that the team could keep supporting them just to give the devs daily work.
> I _really_ have to dispute the idea that unit tests score the maximum on maintainability. The fact that they are _so_ tightly tied to lower-level code makes your code _miserable_ to maintain.
I disagree, to the point that I'd argue the exact opposite. Unit tests are undoubtedly the ones that score the highest on maintainability. They are unit tests, after all, meaning they are self-contained and cover the behavior of individual units that are tested in complete isolation.
If you add a component, you add tests. If you remove a component, you delete its tests. If you fix a bug in a component, you add one or more tests that reproduce the bug and assert the expected output, and you use all the existing tests to verify your fix doesn't introduce a regression. Easy.
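The bug-fix flow in miniature (a made-up example):

```python
def average(xs):
    if not xs:                      # the fix: an empty list used to raise ZeroDivisionError
        return 0.0
    return sum(xs) / len(xs)

def test_average_of_empty_list_is_zero():   # new test that reproduces the bug
    assert average([]) == 0.0

def test_average_of_values():               # existing test, guards against regressions
    assert average([1, 2, 3]) == 2.0
```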
Above all, unit tests serve as your project's documentation of expected behavior and intent. I can't count the times where I spotted the root cause of a bug in a legacy project just by checking related unit tests.
> (...) a whole whack of integration tests at the boundaries of your services/modules (which should have well-defined interfaces that are unlikely to change frequently when making fixes), and a handful of unit tests for things that are Very Important or just difficult or really slow to test at the integration level.
If it works for you then it's perfectly OK. To me, the need for "a whole whack of integration tests" only arises if you failed to put together decent unit test coverage. You specify interfaces at the unit test level, and it's at the unit test level that you verify those invariants. If you decide to dump that responsibility onto integration tests, you're just replacing many fast-running, targeted tests that pinpoint failures with many slow-running, broad-scope tests that you have to analyze to figure out the root cause. On top of that, the tests you must change when you change an interface are exactly the ones that check its invariants. Those who present this as an extra maintainability cost of unit tests are completely missing the point and creating their own problems.
I think the whole problem is just terminology. For example, take your comment: you start talking about unit tests and units, but then suddenly we're talking about components. Are they synonymous with units? Are they a higher-level or a lower-level concept?
People have such varying ideas about what "unit" means. For some it's a function, for others it's a class, for others yet it's a package or module. So talking about "unit" and "unit test" without specifying your own definition of "unit" is pointless, because there will only be misunderstandings.
The "test diamond" has been what I've been working with for a long while now, and I find I greatly prefer it. A few E2E tests to ensure critical system functionality works, a whole whack of integration tests at the boundaries of your services/modules (which should have well-defined interfaces that are unlikely to change frequently when making fixes), and a handful of unit tests for things that are Very Important or just difficult or really slow to test at the integration level.
This helps keep your test suite size from running away on you (unit tests may be fast, but if you work somewhere that has a fetish for them, it can still take forever to run a few thousand), ensures you have good coverage, and helps reinforce good practices around planning and documentation of your system/module interfaces and boundaries.