Okay, so the first dev writes 60 lines of indecipherable code, runs some sample invocations, looks at the output, says it looks good. A few months later, someone - maybe the original dev, maybe some other sucker - notices that in some edge case the code misbehaves. Now what? (Obviously, any answer that involves "don't write code with bugs" or "write perfect tests" is a nonstarter)
If we're going with the "start from scratch if it ever proves inadequate" philosophy, then the person who notices the misbehavior looks at the original code, sees that it's written in some obscure language, is undecipherable, but also is only 60 lines long, and decides that it will probably be simpler to make a new (short) implementation in their own favorite language that correctly handles both the original use case and their new requirement. The key insight is that given how much easier it is to write fresh code than understand old stuff, they could very well be correct in that guess, and the end result is a single piece of small clean code, rather than a simple core with layers of patches glued on top.
In this particular case, we're talking about a "make" replacement, so testing the new implementation can be done by simply running "make all" for the project. If it passes, then the new implementation must be identical to the old one in all the ways that actually matter for the project at hand. In all likelihood, for a simple program like this, fixing one bug will also silently fix others because the new architecture is probably better than the old one.
I actually really like this approach, and have been thinking about this in regards to coding with an LLM - for a sufficiently simple program (and assuming no security concerns), once you trust your test suite, you should trust AI generated code that passes it. And then if requirements change, you should be able to amend the test cases, rerun the AI until it passes all tests and linters, maybe give the code a quick glance, and be on with your life.