TDD Doesn't Force Good Design (aaronbruce.com)
60 points by sh_tomer on Sept 5, 2023 | 111 comments



There's a pattern in technical blogging: take a guideline, interpret it as a guarantee or an absolute requirement, and then show it doesn't always work. Then you can have a catchy title, sounding almost like a proof by contradiction invalidating the whole guideline.

You can make a mess out of any methodology, design pattern, or language. There are always exceptions and trade-offs. You can find cases where tests don't work, types don't work, DRY doesn't work, YAGNI doesn't work, etc., but that shouldn't lead to the conclusion that copy-pasting Perl is just as good as any other approach.


Howdy - this blog post is responding to folks who do claim that TDD makes it "literally impossible" to design bad systems. This is a fairly common claim among TDD advocates, and I think it makes it harder for learners and skeptics to understand the methodology.

I'm not attacking TDD, I'm a big fan of TDD, I'm just attacking the claim that TDD forces good system design. I understand in retrospect why the title of the piece reads as click-baity, but that wasn't intentional.

I only just noticed this morning that this got shared over here to HN, so now I'm gonna poke around here and find out what horrible things yall have been saying about me ;P


Keep in mind that very often, the person interpreting it as a hard requirement and the one showing it's not are different people. There's a lot of cargo culting going on, especially on Twitter.

This has only gotten worse since twitter started paying out on engagement.


Controversy brings clicks. Attacking methodologies is easy click bait.


Fairly obvious, but if you do dumb things and expect a general methodology to save you, it won't. Or, as they say in the fitness world, "you cannot out-train a bad diet".

Moreover, if you have people in your team that are not aligned with the methodology ("I'd rather be doing X", "It slows me down as a developer" etc.) the results will be sub-par. Applying a methodology requires a certain level of mindfulness which doesn't seem to be common these days.


"Hey, we aren't seeing results from TDD."

"It must be the bad attitudes on the team."

Or... TDD just isn't the panacea its proponents make it out to be?


TDD ensures your system/application is testable, that's it, and that's all.

Seriously. TDD does not say, and never has said, anything about the quality of the design. It merely says the design will be testable. While a design that is testable may not be "the best" design (presuming some agreed-upon ordering of goodness of design, which there isn't), it is at least testable and can be proven to work as intended.

That, in the end, is what this is all about. Do you go with what people feel is the best design but you can't prove the result, or do you go with what people feel is the poorer design but you can test the result and prove it works?

Make this decision as though your life and/or life savings depended on it.


It's very common for TDD advocates to claim that testable code is good code - or at least better than it otherwise would be. I don't think that's actually a wild claim.

However, I don't think TDD even ensures testability, especially in javascript/node and similar languages. You can always use goofball mocking frameworks, reflection, and monkey patching to brute force a test into working. That's what I wrote this blog post about.
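For illustration, here's a rough Jest-style sketch of that kind of brute-forcing (the `report`/`db` modules and their functions are hypothetical): the test goes green without the hidden dependency ever being surfaced in the design.

    // report.ts reaches straight into a concrete db module; Jest's module
    // mocking lets us "test" it anyway, without improving the design.
    import { generateReport } from "./report";
    import * as db from "./db";

    jest.mock("./db"); // replaces every export of ./db with an auto-mock

    test("generateReport totals the rows", async () => {
      // Monkey-patch the hidden dependency instead of injecting it.
      jest.mocked(db.fetchRows).mockResolvedValue([{ amount: 2 }, { amount: 3 }]);

      await expect(generateReport()).resolves.toEqual({ total: 5 });
      // Green, but generateReport is still hard-wired to ./db.
    });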

I see real wonky tests a lot, and am confronted by test automation skeptics with bad past experiences, and so I wrote the blog post really for that audience and also for the people who _DO_ claim TDD forces a high quality design.


Agreed, but I’d soften the language on “proving” it works. Often we’re providing an assurance that a complex system works with the combinations of examples and inputs we’ve provided.

There are approaches to provide much stronger assurances, both on the testing and implementation sides, but they’re typically outside the scope of pure play TDD.

And I’ll take a decent assurance over the lack of one any day of the week.


I agree with your statement.

I think where TDD may not work well is with legacy systems where people try to bolt on tests after the fact.

I get the impression from the article that this might be the case.


If legacy systems weren't designed to be tested then it can very well be that they're very difficult to test. I've never seen anyone try to apply TDD to projects modifying those systems. Seems like it'd be an exercise in frustration!


There's a great book by Michael Feathers on this very subject called "Working Effectively with Legacy Code".

He takes a strong "test first" approach and talks about all the work you have to do when working on a system that isn't testable to change it safely with TDD.


Thanks for the info! I'll check it out.


I have absolutely seen high test coverage crystallize bad design. You are less likely to rewrite something tested and proven to work.

I have also seen high coverage that didn't actually test much of anything: code with side effects, none of which were tested; the tests just checked, more or less, that the method ran without a runtime exception.

Coverage is a good tool but a bad metric.
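A sketch of what that looks like in practice (Jest-style; `sendInvoice` and its side effects are hypothetical stand-ins): the coverage report lights up green, but nothing about the behaviour is actually checked.

    import { sendInvoice } from "./billing";

    test("sendInvoice runs", async () => {
      // Every line of sendInvoice gets "covered"...
      await sendInvoice({ customerId: 42, amount: 100 });
      // ...but none of its side effects (email sent? ledger updated?) are
      // asserted. The only thing verified is that no exception was thrown.
    });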


You are less likely to rewrite something that is proven to work and has absolutely no tests. At least in any company that relies on the service (because the chances of breaking something are high). Bad (or very "smart") design usually crystallizes on its own - it is very risky to touch it even with tests.


>You are less likely to rewrite something that is proven to work and has absolutely no tests.

That's emphatically not true if the tests are very low-level unit tests that tightly couple to the implementation. They're a greater impediment to refactoring than zero tests.

I've worked on a couple of projects like this and seen the same crystallization effect as the OP, where people were visibly more afraid of touching unit-tested code.

These types of tests (a) never catch a bug, but (b) will always fail when you change the code for any reason, and (c) are intimidating to maintain.

That means if you refactor the code you usually need to test it just as much manually, on top of making the useless tests go green while following whatever dumb pattern they were written with. That increases the workload of changes on that chunk of code by 10-40%.

Strictly speaking these types of tests are better off deleted, but mid-level/junior developers, and even many seniors who ought to know better, will balk at that due to loss aversion.

I've noticed that this is also the type of bullshit test ChatGPT will write when cajoled, so I anticipate seeing much more of this in the years to come.
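For anyone who hasn't run into them, a sketch of this kind of implementation-coupled test (TypeScript/Jest, with hypothetical `PriceService`/`TaxClient` names): the interaction assertions nail the current implementation in place, so almost any refactor fails the test even when the behaviour is unchanged.

    import { PriceService } from "./priceService";
    import type { TaxClient } from "./taxClient";

    test("calculateTotal asks the tax client exactly once with the net amount", () => {
      const rateFor = jest.fn().mockReturnValue(0.2);
      const taxClient: TaxClient = { rateFor };

      const total = new PriceService(taxClient).calculateTotal([100]);

      // The useful assertion is on the observable result...
      expect(total).toBe(120);
      // ...but these pin the interaction pattern too, so caching the rate,
      // batching calls, or inlining the lookup breaks the test anyway.
      expect(rateFor).toHaveBeenCalledTimes(1);
      expect(rateFor).toHaveBeenCalledWith(100);
    });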


If tests are very low quality, that's the same as having no tests (so we're back to the same problem).

If tests are low quality, they somehow managed to pass code review. So, knowing this, I would have wider questions about the project.

Adjusting tests adds 10% to a refactor? Yes, that is fine (shipping some rare, uncovered regression bug to prod is not fine).

Manual testing after a refactor? No, that is not fine (and the manual testing should have been automated before the refactor, with end-to-end/integration tests, in an ideal world).

Sadly, it all depends on managers; many projects are just "just do it", so nobody cares about keeping the tests sane.

Oh yeah, flaky, weird, undocumented tests would increase the workload for new devs by 40%, and that is true evil.


>If tests are low quality, they somehow managed to pass the code review. So, I would have more wider questions to the project, knowing this.

The answer to this is almost always a lack of skill, or that the team was focused on delivery and willing to neglect test quality in pursuit of that goal.

Either way it's a common enough situation that it can't just be handwaved away. Even if you've fired the team members responsible for that code, the code doesn't just go away.


In my experience, test code has much more volume than the implementation in a lot of cases, but clean coding practices and maintainability go out the window; I often can't make heads or tails out of a unit test hundreds of lines long, so I become hesitant to change the component or its tests. Which defeats the purpose.


People have this aversion to treating test code as production code.

They spend time designing the architecture for whatever they are implementing but when it comes to writing tests, all of that stuff goes out the window and people just do whatever takes the least amount of time.

If you start treating tests as any other code, it should get easier to change as your implementation changes, and you'll get a lot more value out of automated testing.


> If you start treating tests as any other code, it should get easier to change as your implementation changes, and you'll get a lot more value out of automated testing.

No, please, don't treat test code like any other production code. Test code can have duplication throughout if it helps readability; test code that relies on abstractions, utility methods, etc. is hard to read, which makes the test cases themselves hard to understand.

It should be easy to read a test case and understand what it is preparing (I like to add comments marking the given/when/then sections to make this very clear), what is being called, with what parameters, and exactly what the expected result is.

I've had my fair share of interactions with test code trying to be smart, test code abstracting away the assertions (into utility methods that themselves call other utilities), test code trying to avoid duplication, all causing headaches and requiring troubleshooting of the test code itself... That is not the reason for test code to exist; it's not there to be clean and concise. It's test code: you need to know what it is doing, and very quickly assess what failed when a case fails.

Please don't overengineer test code. It should be elegant, but there's no need to write it as you would production code; write it as documentation.
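A small sketch of that style (TypeScript/Jest, `applyDiscount` is a hypothetical function): each case reads top to bottom with explicit given/when/then sections, and the setup is duplicated on purpose rather than hidden behind helpers.

    import { applyDiscount } from "./pricing";

    test("applies a 10% discount to orders over 100", () => {
      // given
      const order = { items: [{ price: 60 }, { price: 50 }] };

      // when
      const discounted = applyDiscount(order, { threshold: 100, percent: 10 });

      // then
      expect(discounted.total).toBe(99);
    });

    test("leaves orders under the threshold untouched", () => {
      // given (duplicated on purpose; the whole case is readable in place)
      const order = { items: [{ price: 40 }] };

      // when
      const discounted = applyDiscount(order, { threshold: 100, percent: 10 });

      // then
      expect(discounted.total).toBe(40);
    });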


Disagree :) In my experience, test code that has duplication everywhere makes it harder to reliably change the code under test, as there are so many tests that have to be rewritten and you're basically guaranteed to fuck something up.

I'm not arguing for making the test cases harder to read, it's very important to be able to see input/output/expected output as always, but it's also important that you're able to change the test code confidently, and duplication actively works against that.

I guess the "truth" or "optimal state" is somewhere in-between. Not 100% duplication but also not 0% abstractions.


> I guess the "truth" or "optimal state" is somewhere in-between. Not 100% duplication but also not 0% abstractions.

Exactly, I've seen way too much test code that abstracted away a fundamental piece needed for understanding the test setup, which made it much harder to reason about when tests failed. Jumping from the test to the utility used to avoid duplication is another layer of cognitive load; I'm already trying to troubleshoot something, and I don't need more things to keep track of while I'm in that process.

Rewriting test code is exactly the moment you take to abstract some of it away, when it's been proven that the pattern repeats and is annoying to rewrite. The usual rule of 3 of software engineering.

> it's also important that you're able to change the test code confidently, and duplication actively works against that.

I disagree with this point a little. If the tests are well written (even if verbose), it should be easy to reconfigure/rewrite them, even with dozens of test cases; it might take a bit more time to rewrite those than just changing a supplier/utility method, but it has helped me a lot to actually read the tests. I've been burnt too many times by someone abstracting away the actual call (or a part of it) to the logic under test into a utility to "avoid duplication" - that call is the whole point of the test case. One should be able to understand every test case easily, so rewriting to accommodate the change should be straightforward; if it's not, then the test case itself is telling you there's a code smell, either in the test or in what is under test.

More egregious to me is when tests are a mess of setups not following a simple, straightforward structure. They become hard to reason about and change over time as the code inevitably grows. Duplication or not won't be the main issue; the structure itself will.

I will abstract away some complex setup/adjacent code that isn't under test, but for anything that the test cases might interact with, and fail because of, I prefer to be extremely explicit and duplicate code instead of trying to make my tests look tidy and neat.


A mantra I usually follow is: "If you're putting something under 'utility'/'utilities'/similar names, you're cheating on your design and only putting it there because you can't figure out a better place - think longer/harder." It applies equally to test code :)


I've been challenging myself in a current work project. I don't want duplication in tests, and I want most test methods to have zero setup. The high-level tests ideally should just start with a GET request and go straight into expectations.

For one, this has resulted in a curious structure. I have 5 or 6 fixtures with representative sample datasets used in production, and these get extended as new features are introduced. If some new addition breaks everything, we'll know immediately, as all tests use the same fixtures and it's all in one place. And it's also instructive during design: these fixtures give you 3-4 situations to think through when adding something new.

And the tests are pretty. Most of them have been boiled down to one test_client request and a comparison with a dataset, just 2-3 logical lines. This makes it easy to see whether tests are still correct and necessary, or to just talk about whether they make sense overall, even with less technical or involved people.


There's no mechanical process where blind adherence could produce good design without the author/engineer having experience or knowledge (or both).

It just stops bad design from setting in too easily - and even then, it's "easy" to work around it, or to apply the process incorrectly or too pedantically.


As you say, TDD can make it "less easy" for a bad design to settle in - if you pay attention. If it's hard to write the test, it may be telling you something about the design. If you listen, that can push you toward fixing the design. If you don't listen, well, you're just going to write the tests that you're told to write, and probably grumble about it, but that doesn't help the design at all.


> If you listen

requires that you know what to listen for.

I've seen junior developers write terrible tests. The thing is, knowing what a good design looks like requires experience. Knowing that a test was hard to write, or bad, requires experience, or at least knowledge of what makes a test good. If you've never seen a good test, you might think the test you wrote was good, and that the pain is just normal.


This is exactly what I'm getting at in the blog post. I think the message that it works "if you listen" needs to be front and center so that learners and skeptics can figure out how to apply TDD in scenarios where they don't have any hands on guidance.


Most solutions on their own don't improve things a whole lot. Yet, in a system of supporting practices, they can be very powerful. The primary thing is that you need a system, not just the individual parts. Testing without changing the design of your code is a horrible experience. Applying techniques like dependency inversion/injection has a positive effect on isolating behaviour, which makes testing easier. Making code more deterministic makes tests easier. Pushing side effects out of your core logic makes testing easier. All of those things add up to more than the sum of their parts, which is an indication that you're dealing with a system.


I'm fine with encouraging proper tests. On a base level, TDD encourages tests, so it's overall fine, at least in principle.

My main gripe with it is the second-order concern that it encourages testing practices that are frankly not very intelligent.

How you should approach testing depends on what kind of function you are testing. Pure functions of their inputs should be tested with property-based tests. If you have

    bool isPrime(int n)
the whole rigmarole of "make a test that fails, then write the least amount of code that makes it pass" brings you to something like:

    assertFalse(isPrime(1));
    assertTrue(isPrime(2));
    assertTrue(isPrime(3));
    assertFalse(isPrime(15));
and so on, where what you really want is to say something like:

    for all 1 < i < n . n % i != 0
This obviously works less well when you have to deal with the real world, but even in that case TDD leaves you with a patchy and inflexible approach.
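For concreteness, here's roughly how that property reads as an executable test, written with fast-check in TypeScript (the `isPrime` module path is hypothetical, and the property is checked against the naive definition rather than hand-picked examples):

    import fc from "fast-check";
    import { isPrime } from "./isPrime";

    // Reference definition: n is prime iff n > 1 and no i in (1, n) divides it.
    const isPrimeByDefinition = (n: number): boolean => {
      if (n <= 1) return false;
      for (let i = 2; i < n; i++) {
        if (n % i === 0) return false;
      }
      return true;
    };

    test("isPrime agrees with the definition for all small n", () => {
      fc.assert(
        fc.property(fc.integer({ min: 1, max: 10_000 }), (n) => {
          expect(isPrime(n)).toBe(isPrimeByDefinition(n));
        })
      );
    });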


TDD works just fine with property-based tests, with each case representing an equivalence class. I like to randomly select from those classes because that eventually double-checks that I set my boundaries correctly. I often additionally pin the class boundaries in place.

https://en.wikipedia.org/wiki/Equivalence_partitioning


Interesting, could you say more about how you go about this? I’m in the middle of figuring out how I want to do property-based testing for a parser, using fast-check.


Dependency injection is only useful when you've managed to isolate logic/math-intensive code. Some apps don't have any logic-intensive code. Many others have very little.

In those cases dependency injection just increases the SLOC, with the payoff that you are able to write a bunch of trivial unit tests that'll probably never catch a bug.

Integration tests as a default have the best ROI in those cases.


I look more towards optionality myself. Take for example the encapsulation of randomness. Depending on an abstract notion of randomness (an interface) that decouples you from the implementation is as useful for testing as it is for maintenance of a system. For tests, removing randomness entirely makes them deterministic, allowing you to test for exact matches instead of approximations. For systems: at a smaller scale you can get away with a reduced amount of randomness, while systems at scale require more sophisticated code for this. You don't want to replace all of that code in all instances, but rather leverage the capability and replace the implementation.

Another angle is the encapsulation of storage. Using in-memory storage for tests makes CI pipelines very quick, and production storage might evolve over time to accommodate scaling requirements (sharding and such).
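A minimal sketch of the randomness case (TypeScript; the names are illustrative): production code depends on a small interface, the real implementation wraps Math.random, and tests substitute a fixed source so results are exact.

    interface RandomSource {
      next(): number; // a value in [0, 1)
    }

    class SystemRandom implements RandomSource {
      next(): number {
        return Math.random();
      }
    }

    class FixedRandom implements RandomSource {
      private i = 0;
      constructor(private readonly values: number[]) {}
      next(): number {
        return this.values[this.i++ % this.values.length];
      }
    }

    // Production code depends on the abstraction, not on Math.random directly.
    function pick<T>(items: T[], rng: RandomSource): T {
      return items[Math.floor(rng.next() * items.length)];
    }

    // In a test, the fixed source makes the result deterministic:
    //   pick(["a", "b", "c"], new FixedRandom([0.5])) === "b"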


You don't have to inject every dependency. If it's essentially a pure function, then just call that function (despite what old-school Java advocates say, free functions not tied to any object are fine).

I like dependency injection for things which have state and/or do IO and/or are expensive to construct.


This isn't really true. For one, dependency injection isn't specific to testing or processing logic; it's an architectural approach used primarily for managing separation of concerns and modularisation.


>it's an architectural approach used primarily for managing separation of concerns and modularisation

That always feels like a post hoc justification to me. If you demand 100% test coverage on a Java/C# project then you will almost certainly end up with some kind of dependency injection at the end. Whether that is useful or not is often debatable.


Dependency injection where the injected dependencies are interfaces - particularly, interfaces designed to provide the minimal external requirements of the module - has a broader architectural impact on the design and implementation of a system than just enabling unit testing. It enables isolation, decoupling and modularisation and clarifies the system interactions.


> It enables isolation, decoupling and modularisation and clarifies the system interactions.

That’s what Spring claims but the truth is that using interfaces enables those things. Dependency injection might encourage the use of interfaces, but it also inserts an abstraction between tests and the things you want to test.

In my experience dependency injection frequently obfuscates system interactions and invalidates any assurances you might get from good unit tests.

Dependency injection is just a tool. It can be useful but is often misused and doesn’t perform magic.


You can do dependency injection without a DI framework/container, just declare the required dependencies for a class as required parameters in the constructor and build your own dependency tree manually when constructing the objects in `main()`-equivalent.

Using a DI container is not required to implement the DI pattern, it's just a tool that facilitates some more complex DI.

I tend to prefer to keep my projects simple enough that manual DI is quite appropriate and readable.
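A small sketch of what that looks like in TypeScript (all names are illustrative): dependencies are constructor parameters, and the tree is wired by hand at the entry point.

    interface UserRepository {
      findById(id: string): Promise<{ id: string; email: string } | null>;
    }

    class PostgresUserRepository implements UserRepository {
      constructor(private readonly connectionString: string) {}
      async findById(id: string) {
        // A real query would go here; logging stands in for it.
        console.log(`querying ${this.connectionString} for ${id}`);
        return { id, email: "someone@example.com" };
      }
    }

    class WelcomeEmailService {
      // Dependencies are declared as constructor parameters...
      constructor(private readonly users: UserRepository) {}
      async send(userId: string): Promise<void> {
        const user = await this.users.findById(userId);
        if (user) console.log(`sending welcome email to ${user.email}`);
      }
    }

    // ...and the whole tree is wired by hand at the entry point.
    async function main() {
      const users = new PostgresUserRepository(process.env.DATABASE_URL ?? "");
      const welcome = new WelcomeEmailService(users);
      await welcome.send("user-1");
    }

    main();

In a test, the same WelcomeEmailService just gets handed an in-memory UserRepository instead; no container needed.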


As the other commenter points out, I believe you are conflating DI implementations with Dependency Injection as a methodology. But I suppose I am too, because really I'm talking about using dependency injection to achieve Dependency Inversion - the D of the SOLID design principles.

There are good and bad implementations of dependency injection (or, perhaps, appropriate and inappropriate depending on requirements), but the principle of dependency inversion applies at an architectural level, before specifics of implementation and things like unit testing come into play.


I didn't say it was specific to testing. Nor did I say it was never appropriate. Nonetheless, the pursuit of "unit testability" is probably most often why it's done.


> If testing your code is hard it means your code needs to be factored better. Stop what you’re doing and fix the design so that testing is easy again.

Then again, another central selling point of having tests is that they enable refactoring without having to fear that you've changed the behaviour of the code. So, if you have a codebase without tests, you should write tests while changing the code as little as possible, and then refactor it.


There's a parallel in the design world that remains under-discussed. In my journey as a small studio owner, collaborating with numerous startups, I've found that raw experimentation, while valuable, isn't the only path to innovation. A startup can conduct endless user interviews, much like a developer can persist with TDD, but at some point, the sheer weight of expertise begins to eclipse the utility of trial and error.

There's a pattern to human behavior, just as there is in company challenges; a sort of predictability within the myriad of unique use-cases. So, as much as startups are evangelized about "talking to users", there's an underestimated power in simply hiring seasoned talent.

I for example have a decade+ worth of experience designing B2B SaaS products. Many challenges have familiar contours and are solvable without the rigmarole of repeated experimentation. Just as TDD can be misinterpreted as a silver bullet for flawless code, so can raw user feedback for impeccable design. The nuance, often, lies in the marriage of systematic process and accumulated wisdom.


Historically, those of my (non-delusional) colleagues who produced poor UX/API designs don't defend them as good. They say things just had to get done. Sometimes management incentives tilt towards making poor solutions (deadlines used to justify a hasty job); other times developers just don't care.


>other times developers just don't care

I've seen that far more often than management pressuring people into writing bad code.


> Especially in languages like javascript you can do a lot with mocking frameworks and dependency orchestration hacking to make bad code work.

This is precisely why I do everything I can to block usage of Jest and, more generally, another reason we need to sunset CommonJS. People just abuse this stuff. There is a fine line in our craft where your tools need to hinder convenience, which is why we have so much discussion around strong vs. weak typing, why TypeScript has gained so much traction, etc.

The vast majority of 'business' stakeholders are going to exert downward pressure for the most convenient option, but the consequence isn't merely this or that quality of code design; it's that subsequent generations of developers _have no chance_ of developing actual system design skills, evaluation of patterns, etc. And why should they? Use Jest to mock all the things, slap the 100% code coverage label on it, and start moving the next ticket to 'Done'.


Could someone elaborate on why “mocking is a smell”? Is it opposed to running the whole system with real interactions?

I find it a nice shortcut, but it's also true that the mocked behavior may not exist in the real code path.

Still, I reach for mocking libraries often. Is that a bad practice, and why?


Mocking, when it is a large part of your test setup, makes your test focus on the blueprint of the implementation rather than the outcome of a unit of behaviour. This causes the tests to break relatively soon when the implementation changes. When a test does not use mocking and rather treats the code as a sealed box, the tests are more stable. Stable tests yield better guarantees for stable outcomes even when the underlying code is (radically) changed/refactored. Using mocking frameworks usually makes refactoring harder, as changing the underlying code immediately invalidates the test. Changing the tests and code in unison makes changes less safe.


I must add that it's mostly mocking libraries that auto-generate mocks that are the culprit here. They're quick to introduce, but once introduced they make it harder to change the implementation, as the implementation logic is now replicated in N instances of tests. More surface area, more friction for change. Many mocking libraries also allow you to pin down behaviour that is incompatible with the actual implementation, especially in dynamically typed languages.


Mocks have their uses, but overzealous overmocking really means the death of a test suite. At this point, tests become more of a nuisance than anything helpful (especially in a statically typed language where you know that if your code compiles the pieces at least sort of have to fit together).

Mocks (or related techniques such as stubs, fakes) are useful when you have to interact with an outside system (e.g. send emails), for testing boundary conditions that are hard to replicate (what happens if an error is thrown), for highly generic code or, potentially, for really major architectural boundaries in your code base (but I feel that most people overestimate the confidence they have that they've determined the correct boundaries).

For anything else, I'd prefer to isolate pure code from code with side effects as much as possible so that the former can be unit tested without any dependencies (even if one class/module/whatever calls another, it doesn't have to be mocked out), and the latter is mostly just wiring that can be tested through a combination of integration testing and maybe some occasional use of a mock here and there.

Unfortunately not many code bases are structured that way, so you'll have to pick your battles and maybe live with more (brittle) mocks than you'd like - or with slower and flakier (integration) tests than you'd like.


I have an extremely hard time reading & benefitting from tests that make use of excessive mocks. Some apps I've worked on recently have mocked out so many things, large parts of the code base don't actually get hit by the actual tests. It's super annoying. To be honest I can't even make sense of it most of the time.


I felt that. But also: how do you deal with testing interaction with an external API? Or code that you control but that requires a large effort to instantiate? (You want to test f but need to stand up a, b, c, d and e.) I find it tempting to mock e and call it a day.


I think it's fine to mock out a call to an external service (e.g. your app making calls to AWS S3). But when you mock out too many calls to code in your own app, that's brittle in my opinion. For example, say you're testing the "Foo.new().foo()" method, and you mock out calls to "Bar.new().bar()". You could make a breaking change in "Bar.new().bar()", say you go from returning "bar" to "BAR". The test of "Foo.new().foo()" would keep passing while testing the wrong value. Now if you do that all over your application, you can have a nicely factored test suite that is completely detached from reality.
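Translated into TypeScript/Jest terms (a sketch, with `foo`/`bar` as stand-ins for the Foo/Bar above), the drift looks like this:

    import { foo } from "./foo";
    import { bar } from "./bar";

    jest.mock("./bar");

    test("foo() forwards bar's value", () => {
      jest.mocked(bar).mockReturnValue("bar");

      // This keeps passing even after the real bar() is changed to return
      // "BAR"; the assertion is pinned to the mock, not to reality. Only an
      // integration test (or a contract test against the real bar) catches it.
      expect(foo()).toBe("bar");
    });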


I don't do TDD - so this is my outsider's understanding

The issue is that if you need to mock then your design is not really properly decoupled. The mock bakes in assumptions about your program state - which may or may not hold up in the real world. This ends up creating a sort of invisible coupling to whatever will be creating/updating the state in the real world

The terminology is a bit fuzzy b/c the author describes his unit tests as using mocks - but to me that's not a unit test, that's an integration test.

At the end of the day you do need some coupling and some integration tests are inevitable - but the idea is that TDD pushes you to minimize that

Would love to hear anyone's corrections :)


One issue with mocks is that you have to ensure that the way they are mocked accurately represents the actual behaviour of the thing being mocked, in production.

Otherwise your mocks may end up driving the system to a state that is never actually encountered in production, while the unit test still passes. When you run an integration test or a real build of the application, the actual system itself will fail to behave as expected.

The mock implementations can diverge from the actual production behaviour.


I think it depends heavily on what you're mocking and what the code you're testing looks like.

I think Martin Fowler has a nice dichotomy where he splits tests up into solitary (generally using mocks for deps) and sociable (generally using real code for dependencies): https://martinfowler.com/bliki/UnitTest.html


There's a smell in the code but it's not the mocks.

I've seen people have problems with mocks when they have poorly designed code that demands Byzantine test setups. The "bad mocks" smell usually means the production code is due for a refactor. (So let's hope you have behavioral tests and not implementation-only tests to help everyone refactor.)


Does your user run your software with stuff mocked out? The reason for testing is getting ahead of issues before the user finds them.


> The reason for testing is getting ahead of issues before the user finds them.

The customer always uses software in some unexpected way, but we don't defer all "testing" to just watching the customer interact.

Testing is to give developers the fastest, most meaningful, most actionable feedback about their code changes so that they can make better decisions.

Customer issues are not the only things we're worried about. And software seems to be usable and make plenty of money even when users see something screwy once in a while.


I think I have my answer: I use mocks mostly for integration tests, and those are not in the scope of this article.


It's not TDD that I don't like, it's the zealotry of its proponents.

In any case, TDD helped trainers put their kids through college; now the hype is over and we can be productive again.

PS: thank god, mainstream pair programming is dead too.


To be fair, there are plenty of anti-TDD zealots as well.


Pure TDD as originally described ("write the minimal code to satisfy the test") does not actually encourage any design at all, just throwing code together on the fly.


Then it's about designing the tests.

Maybe that _is_ a good idea...


Definitely, this...


"The claim that automated testing and TDD forces you to produce better designed systems isn’t strictly true"

Has TDD become so sacred that we need to beat around the bush?

TDD quite often in fact forces bad design, and combined with agile and especially scrum, often 'forces' myopic architecture too.

There, I said it.

Some kind of specification and testing is still way better than none at all, but current practices are not without potentially deep problems.


Agree. I saw a codebase with extreme TDD, and the result was that it could achieve an extreme amount of mocking (and it did). The design was bad, but very testable.


To quote Kent Beck of Extreme Programming fame [1]:

--- start quote ---

I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence

--- end quote ---

In my opinion, TDD as proposed by testing zealots is harmful. Instead of testing a unit of work, testing frameworks and methodologies all but force you to test units of code, and that is not the same thing.

E.g., if your code responds with a list of items from a database, you should test that. Instead, people test the controller in isolation (mocking everything), the transforming function in isolation (mocking everything), the database layer in isolation (mocking everything)...

No. You have to test that your code, when called, returns a list of items. If you can't have the database, mock that, and only that. Boom, you have 1500% fewer tests, with the same coverage, and an actually increased level of confidence.

[1] https://stackoverflow.com/a/153565
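A sketch of that shape in TypeScript (hypothetical names; a hand-rolled in-memory fake stands in for the database, and everything else runs for real):

    interface ItemRepository {
      all(): Promise<string[]>;
    }

    // The controller/transformation code under test, unmodified.
    async function listItems(repo: ItemRepository): Promise<{ items: string[] }> {
      const rows = await repo.all();
      return { items: rows.map((r) => r.trim()).sort() };
    }

    test("returns the items from the database, cleaned and sorted", async () => {
      // Only the storage boundary is replaced.
      const fakeRepo: ItemRepository = { all: async () => [" beta", "alpha "] };

      await expect(listItems(fakeRepo)).resolves.toEqual({
        items: ["alpha", "beta"],
      });
    });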


Yeah, I've found I'm starting to write fewer, coarser tests, and it is good for the code base. You can always augment them later with tests for bugs as they turn up. And if you can isolate the main logic of a system into small components extended by addition rather than modification, even better.

In most of the small systems we're developing at work, I tend to have two groups of tests: one testcontainer-driven group testing the actual database layer. These tests are very simple and small - migrate the database, call insert a few times, load objects back and compare them. Nothing more. The other tests are usually of the nature "given these elements in the database, expect that output for a REST call", augmented with a few tests of some fiddly functions.

The other thing I meant occurred, for example, in a stream processor which needed to apply different transformations to incoming messages for normalization, default values, precomputations and so on. Here we reached a point of confidence where each of these processors was tested as a stream-to-stream mapping in isolation. It was pretty and it worked wonderfully and without issues. If you can achieve this kind of system of pretty much functional components and extension by addition, that's very much a pinnacle of testability.


> You have to test that your code, when called, returns a list of items.

What do you do when the list is wrong? (e.g., not transformed/serialized/sanitized/whatever)

Sometimes manual tests show it's not the expected/correct data from the first run; or later someone adds a line of code (adds a period to a random row). The test runs happy, even though the data is now incorrect/unexpected.

Moving something that can be automated with mocks to manual processes is a bad tradeoff.


> What do you do when the list is wrong? (e.g., not transformed/serialized/sanitized/whatever)

What do you do when that happens with "regular" unit tests that test every single piece of code in isolation?

You fix it.

> Moving something that can be automated with mocks, to manual processes

Who said anything about manual processes?


> Who said anything about manual processes?

It's either automated or manual.


What is?

Where in my comment did I say anything about manual process? Please re-read what I wrote.


> What do you do when that happens with "regular" unit tests that test every single piece of code in isolation?

> You fix it.

If you don't find it via an automated means, it's via a manual means. The implicit assumption is that the error is found, regardless of where that happens: in dev, test, prod, or your env du jour. Good luck with whatever.


> I experienced this first hand when I first started learning to do TDD with unit tests. We had a badly designed system and we adopted a unit testing strategy that said every class needed to be tested independently from every other class. Testing this system was hard. It was a lot of effort with mocks and dependency wrangling to exercise our classes, and it produced brittle tests that never added any value in terms of the quality of the overall system.

> We were writing bad tests in a badly designed system and having a horrible time. The conclusion we came to was that this testing thing was for the birds. It was slowing us down and not really adding any benefit. Get rid of it.

If the tests are revealing that the system is "badly designed," the solution is not to throw away the tests in my view - it's to work on gradually refactoring the system to address the difficulties with testing it (of course subject to developer time/budget constraints...). The tests provide a gauge of the "internal quality" of the system (to take a concept introduced by the text "Growing Object-Oriented Software, Guided by Tests" [0]).

xUnit Patterns has published a list of "test smells" and possible defects they may indicate in the design of the system: http://xunitpatterns.com/Test%20Smells.html

To be fair the author does call this point out in their conclusion:

> If testing your code is hard it means your code needs to be factored better. Stop what you’re doing and fix the design so that testing is easy again.

I would also add that adhering to the testing pyramid [1] principle is important, such that your system is not overly reliant on mocks (which can diverge from the actual "production" behaviour of the system) and/or brittle unit tests that break easily when changes to the system are introduced.

[0] https://www.google.ca/books/edition/Growing_Object_Oriented_...

[1] https://martinfowler.com/articles/practical-test-pyramid.htm...


>If the tests are revealing that the system is "badly designed," the solution is not to throw away the tests in my view - it's to work on gradually refactoring the system to address the difficulties with testing it

I'm mystified that people keep suggesting this given the glaringly obvious chicken-egg problem inherent in doing it.

If you think you can't safely refactor without reliable unit tests and you can't get reliable unit tests without refactoring then you are stuck.

The ironic thing is that on the projects I've dug out of a "not safe to refactor" quagmire (with high-level integration tests), I always used to have an end goal of refactoring to make the code more unit-testable. It felt like the "right" thing to do. That was just the industry dogma talking, though.

In practice, by the time I reached the point where we could do that, there was usually no point. Refactoring towards unit testability had a low/negative ROI at that point, with code that was already safe to change.


> If you think you can't safely refactor without reliable unit tests and you can't get reliable unit tests without refactoring then you are stuck.

Books have been written on exactly this problem; Michael Feathers' "Working Effectively with Legacy Code" comes to mind, where he explicitly names this problem as "The Legacy Code Dilemma:"

> When we change code, we should have tests in place. To put tests in place, we often have to change code.

It's not exactly an easy problem (hence the existence of the book), but there do exist techniques for getting around it - speaking very generally, finding "seams" (places where the behaviour of the system can be modified without changing the source code), breaking dependencies, and gradually getting smaller and smaller units of the system into test harnesses.

Sometimes the code does need to change in order to enable testing:

> Is it safe to do these refactorings without tests? It can be. [...] The trick is to do these initial refactorings very conservatively.

Martin Fowler has catalogued [0] some of these "safe moves" or refactorings one can make; Feathers also recommends the use of tooling and automated refactoring support (provided one understands how safe those tools are, and what guarantees they offer) in order to make these initial refactorings to get the code under test.

Whether or not this is actually worth the time invested, is another matter, and probably one far more complex.

[0] https://refactoring.com/catalog/

[1] https://archive.org/details/working-effectively-with-legacy-...


>It's not exactly an easy problem (hence the existence of the book), but there do exist techniques for getting around it - speaking very generally, finding "seams" (places where the behaviour of the system can be modified without changing the source code), breaking dependencies, and gradually getting smaller and smaller units of the system into test harnesses.

These techniques are on the right track but they are outdated. They advocate making changes that are as small as possible (hence the seams thing) while covering the app with unit tests. Most of the advice centers around this maxim: change as little as possible.

What's the smallest amount of change you can make? Zero - i.e. running hermetic end-to-end tests over the app and changing no more than a couple of lines.

They don't advocate zero though. They advocate unit tests.

To be fair to these authors that option was not really viable when they first wrote the book. Instead of 200 15 second reliable playwright tests run across 20 cloud based workers completing in 3 minutes you were faced with the possibility of 24-48 hour flaky test suites running on one Jenkins server.

So, it probably made sense more often to take slightly larger risks to crack open some of those seams in pursuit of a test that was less flaky and ran in under a second rather than 2 minutes.


> To be fair to these authors that option was not really viable when they first wrote the book. Instead of 200 15 second reliable playwright tests run across 20 cloud based workers completing in 3 minutes you were faced with the possibility of 24-48 hour flaky test suites running on one Jenkins server.

The problem with integration and e2e tests is that they do not give you a measure of the "internal quality" of the system in the way unit tests do. Quoting again from "Growing Object Oriented Software, Guided by Tests:"

> Running end-to-end tests tells us about the external quality of our system, and writing them tells us something about how well we (the whole team) understand the domain, but end-to-end tests don’t tell us how well we’ve written the code. Writing unit tests gives us a lot of feedback about the quality of our code [...]

I don't think compute / test duration were the only motivations behind this approach.


If I inherited a crappy code base I want to get to a place where I can safely refactor as quickly as possible.

I don't need indirect commentary on how crap the code base is in the form of tests that are annoying to write because they require 100 mock objects. It's not telling me anything new, and it's annoying the hell out of me while it does it.

If the codebase is good, I also don't need that commentary. I can read.

Indeed maybe the real message that unit tests are sending by being intolerant of bad code is that they are also the bad code.


Huzzah!

I'll add that the meme of preferring integration tests to any other tests seems to stem from badly designed code. Looking at production code less closely disguises many sins.


To be fair, while I agree with you, integration/e2e tests are much, much easier to introduce into a legacy system, and it's really easy to break things at the edges, so they are definitely useful.


Nothing reliably forces good design, or good code. A shitty team will design a shitty application full of shitty code regardless of testing methodologies or anything else. What else is new?

Does that mean that literally all development processes are bad and should be dismissed, and everybody should just open their editor and start writing, organisation be damned? Seems that way if you read the comments on literally any post on Hacker News.


I agree that there is nothing that forces good design, but if you force people to take more time to think about the design, it increases the likelihood of good design.

The problem is that people either don't take the time to think about it, or they're not given enough time.


I've seen code bases where a lot of the tests are actually testing the underlying framework's code (i.e. the .NET framework) after all the dependent services have been mocked out. In my mind, unit tests are a waste of time except for testing really complex algorithms where there are no side effects. Automated end-to-end QA testing is where the effort should be spent.


Unit tests are mostly for hitting branching logic within your procedure/method/function (and the private helpers it calls that can't be tested independently), and so are naturally paired with code coverage. If your percent coverage is high and someone changes the implementation, a drop in coverage and/or failure of one or more unit tests can signal that some assumptions about the behavior may need additional consideration.

End-to-end testing is important too, but it may be difficult to ferret out tricky bugs if unit tests and coverage for your modules (or whatever your lang/kit calls them) don't tell their own story about what code is being exercised and how.


I don't really know where such a claim came from. TDD is really about getting features tested. I've never heard similar claims before.


"I never said that test-first was a testing technique. In fact, if I remember correctly, I explicitly stated that it wasn’t." -- Kent Beck, September 2001.

Similar claims have been part of the TDD tradition since its very beginning.


Reusable modules of code involve implementation and the interface.

If you just write out an implementation without thought for how others are to use it, there's a risk that you'll end up with an interface that's more complex to use than it needs to be.

The idea is that by 'using' the interface of the module first (e.g. by writing tests first, or by writing documentation first), then the design of the interface isn't left to be an afterthought of the implementation.

Similarly, it's easier to consider edge cases when thinking about unit tests / documentation, etc.


That’s amazing because people have been arguing about TDDs impact or lack of as a design technique for over a decade!


Inside-Out TDD doesn't necessarily drive your architectural design, that's right. But Outside-In TDD can do so.


TDD is a tradeoff.

TDD improves design *for the user* at the expense of implementation design *for the coder*.

Because you write how the code is going to be used first, you're completely free of implementation constraints, so you can make it as easy as possible for the user to use your library/app/whatever. This makes it harder for you because now you have implementation constraints that you didn't necessarily have if you started without TDD.

That's the whole point of TDD.


> We had a badly designed system and we adopted a unit testing strategy that said every class needed to be tested independently from every other class. Testing this system was hard. It was a lot of effort with mocks and dependency wrangling to exercise our classes, and it produced brittle tests that never added any value in terms of the quality of the overall system.

Sounds like the codebase is the problem, not the tests. Too much tight coupling.


I feel like unit tests slow you down in a framework-heavy application, like an Angular app. The more you refactor and split, the more your unit tests will look like

FakeAsync Fakeclick MockService was called with FakeParameters

Doesn't really add much value over

click triggers function calls service


I think the simple claim that TDD improves design isn't strictly true, and turns some people away from automated testing.

Never really heard this claim before?


It's a common fallacy among some software developers that since TDD forces you to write small units of code (since you have to write small unit tests, the units you test will naturally also be small), your code will be easy to maintain. It also forces dependency injection in OOP and a couple of other design patterns. I've seen code written as if by one's feet, without unit tests, that worked better than code with 90% test coverage. In a few cases TDD was the go-to practice for sloppy devs. Somehow they thought that writing tests would make up for a lack of skill. But guess what? The tests were sloppy too.


I don't think anybody could claim TDD will make a sloppy dev write good code. Though there's half an argument for getting more experienced/skilled devs to write the tests and the more junior or less capable ones to fill in the actual working implementation (but I'll be honest, I've never seen this work in practice over a prolonged period).


> (but I'll be honest, I've never seen this work in practice over a prolonged period)

Me neither. I think that can only work when the same thing has been done countless times before. But at that point you can just use someone else's work, either via a library or an off the shelf product, instead of reinventing the wheel. And then you are left with working only on something that's new. Which invites the question - how can you test that which you don't know you need? In the context of unit testing that is.


If tests are written first (_and_ the dev has a good eye for design, which is not a given), then it is likelier that you will write a more ergonomic interface (e.g. methods having a smaller number of parameters/dependencies, "intention revealing names" [0], etc.).

In a unit-testing paradigm, this also means that you will need to inject the unit's dependencies so that it can be driven into specific states by each test in the suite, encouraging dependency inversion and the open/closed principle (as given by Uncle Bob [1]; note that with dependency inversion it becomes possible to vary the behaviour of the unit in test vs. production environments simply by specifying different dependencies without modifying the unit's code itself). If the dependencies are too cumbersome or even impossible to set up, then the test reveals something about the design (e.g. high coupling).

Of course, it's also possible to just grin and bear the pain of tests being too difficult to write, without actually resolving the design problems revealed by the test...

[0] https://wiki.c2.com/?IntentionRevealingNames

[1] https://web.archive.org/web/20060822033314/http://www.object...


I've heard this a few times in the Ruby community from some well-known figures like Sandi Metz. Simple example: time-related tests can be cumbersome to set up, but if you inject the time into the method, the test becomes easy, incidentally improving your design by not relying on global objects...
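A minimal sketch of the same idea in TypeScript (rather than Ruby; `isExpired` is a made-up example): the clock is just a parameter with a sensible default, so tests never touch global time.

    type Clock = () => Date;

    function isExpired(expiresAt: Date, now: Clock = () => new Date()): boolean {
      return now().getTime() > expiresAt.getTime();
    }

    // In production the default clock is used; in tests, time is injected.
    test("a token past its expiry date is expired", () => {
      const fixedNow: Clock = () => new Date("2023-09-05T12:00:00Z");
      expect(isExpired(new Date("2023-09-01T00:00:00Z"), fixedNow)).toBe(true);
      expect(isExpired(new Date("2023-12-31T00:00:00Z"), fixedNow)).toBe(false);
    });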

But I am sceptical this can work without knowing and practicing good design principles (SOLID) independently from tests!

However, I believe tests are super important for refactoring, whether the design is good or not. My 2 cents: if the design is bad (spaghetti and all), tests should be done more at the functional level; when you're confident the design is getting good, then rely more on unit tests.


Came to say this, never heard this before. I have heard the claim that it improves quality, but even that is debatable. I'm sure the most basic claim that can be proven is that it often increases test coverage.


Improvement in quality is improvement in design, since having fewer bugs is one of the desired properties of good design.


A bad design may work equally well when compared to a good design, given it's patched enough.


Totally disagree. A bad design can pass all tests and appear high quality, then crumble under high load. Even quality is subjective: most people would assume it means being good in every aspect, but passing tests doesn't guarantee that.


Unless the bug fixes are so fragile and/or convoluted they make any change in business requirements almost impossible to implement.


> Never really heard this claim before ?

Neither have I. I guess it really depends on what the code would be like without the constraints that make it testable. A big ball-of-mud component hardly qualifies as good design, and TDD definitely dissuades people from that mistake.


A big ball of mocks and ten thousand layers of abstraction is so much better!


If the unit of code you want to test needs a big ball of mocks and ten thousand layers of abstraction to be testable, your code is already horrible and TDD was not the cause.


It is a common claim made by TDD proponents. The reasoning is that it gives you a better perspective on the external-facing APIs you are developing, because you are forced to use them when writing the tests.





