
I've been thinking about the notion of "reasoning locally" recently. Enabling local reasoning is the only way to scale software development past some number of lines or complexity. When reasoning locally, one only needs to understand a small subset, hundreds of lines, to safely make changes in programs comprising millions.

I find types help massively with this. A function with well-constrained inputs and outputs is easy to reason about. One does not have to look at other code to do it. However, programs that leverage types effectively are sometimes construed as having high cognitive load, when in fact they have low load. For example, a type like `Option<HashSet<UserId>>` carries a lot of information (i.e. has low load): we might not have a set of user ids, but if we do, they are unique.
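A minimal Rust sketch of what I mean (the `UserId` newtype and the `assigned_reviewers` function are made up for illustration):

  use std::collections::HashSet;

  // Hypothetical newtype; Eq + Hash let it live in a HashSet.
  #[derive(PartialEq, Eq, Hash)]
  struct UserId(u64);

  // The signature alone carries the invariants: the set may be absent,
  // and if it is present the ids are unique. No other code needs reading.
  fn assigned_reviewers(task_id: u64) -> Option<HashSet<UserId>> {
      if task_id == 0 {
          return None; // placeholder: nothing assigned yet
      }
      Some(HashSet::from([UserId(1), UserId(2)]))
  }

  fn main() {
      if let Some(reviewers) = assigned_reviewers(42) {
          println!("{} unique reviewers", reviewers.len());
      }
  }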

The discourse around small functions and the clean code guidelines is fascinating. The complaint is usually, as in this post, that having to go read all the small functions adds cognitive load and makes reading the code harder. Proponents of small functions argue that you don't have to read more than the signature and name of a function to understand what it does; it's obvious what a function called `last` that takes a list and returns an optional value does. If someone feels compelled to read every function, either the functions are poor abstractions or the reader has trust issues, which may be warranted. Of course, all abstractions are leaky, but perhaps some initial trust in `last` is warranted.
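For what it's worth, `last` really can be read from the signature alone; a Rust-flavoured sketch (the body is the obvious one):

  // Takes a list, returns an optional value; the name says which one.
  fn last<T>(items: &[T]) -> Option<&T> {
      if items.is_empty() {
          None
      } else {
          Some(&items[items.len() - 1])
      }
  }

  fn main() {
      assert_eq!(last(&[1, 2, 3]), Some(&3));
      assert_eq!(last::<i32>(&[]), None);
  }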



> A function with well-constrained inputs and outputs is easy to reason about.

It's quite easy to imagine a well factored codebase where all things are neatly separated. If you've written something a thousand times, like user authentication, then you can plan out exactly how you want to separate everything. But user authentication isn't where things get messy.

The messy stuff is where the real world concepts need to be transformed into code. Where just the concepts need to be whiteboarded and explained because they're unintuitive and confusing. Then these unintuitive and confusing concepts need to somehow be described to the computer.

Oh, and it needs to be fast. So not only do you need to model an unintuitive and confusing concept - you also need to write it in a convoluted way because, for various annoying reasons, that's what performs best on the computer.

Oh, and in 6 months the unintuitive and confusing concept needs to be completely changed into - surprise, surprise - a completely different but equally unintuitive and confusing concept.

Oh, and you can't rewrite everything because there isn't enough time or budget to do that. You have to minimally change the current unintuitive and confusing thing so that it works like the new unintuitive and confusing thing is supposed to work.

Oh, and the original author doesn't work here anymore so no one's here to explain the original code's intent.


> Oh, and the original author doesn't work here anymore so no one's here to explain the original code's intent.

To be fair, even if I still work there, I don't know that I'm going to be of much help 6 months later other than an "oh yeah, I remember that had some weird business requirements"


Might I recommend writing those weird business requirements down as comments instead of just hoping someone will guess them six months down the line?


So even if comments are flawlessly updated they are not a silver bullet. Not everyone is good at explaining confusing concepts in plain English, so worst case you have confusing code and a comment that is 90% accurate but describes one detail in a way that doesn't really match what the code says. This will make you question whether you have understood what the code does, and it will take time and effort to convince yourself that the code is in fact deterministic and unsurprising.

(but most often the comment is just not updated, or is updated along with the code but without full understanding, which is what caused the bug that is the reason you are looking at the code in question)


> So even if comments are flawlessly updated they are not a silver bullet.

This "has to be perfect in perpetuity or it is of no value" mentality I don't find helpful.

Be kind to FutureDev. Comment the weird "why"s. If you need to change it later, adjust the comment.


I don't think comments need to be perfect to have value. My point was that if a certain piece of code is solving a particularly confusing problem in the domain, explaining it in a comment doesn't _necessarily_ mean the code will be less confusing to future dev if the current developer is not able to capture the issue in plain English. Future dev would be happier I think with putting more effort into refactoring and making the code more readable and clear. When that fails, a "here be dragons" comment is valuable.


They can write a very long comment explaining why it is confusing them in X, Y, Z vague ways. Or even multilingual comments if they have better writing skills in another language.

And even if they don’t know themselves why they are confused, they can still describe how they are confused.


And that time spent writing a small paper in one's native language would be better spent trying to make the code speak for itself. Maybe get some help, pair up and tackle the complexity. And when both/all involved are like, we can't make this any clearer and it's still confusing af. _Then_ it's time to write that lengthy comment for future poor maintainers.


You can only do the “what” with clearer code. The “why” needs some documentation. Even if it is obvious what the strange conditionals do, someone needs to have written down that this particular code is there because of the special exemption from import tariffs on cigarettes, due to the trade agreement between Serbia and Tunis that was valid between the years 1992 and 2007.


This is where a good comment really can help! And in these types of domains I would guess/hope that there exists some project master list to crossref that will send both developers and domain experts to the same source for "tariff-EU-92-0578", specifically the section 'exemptions'. So the comment is not just a whole paragraph copied in between a couple of /* */
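Something like this, say (a sketch; the struct and condition are invented, the reference id is the one above):

  struct Product {
      is_cigarettes: bool,
      base_tariff: f64,
  }

  // Why: cigarettes are exempt from the import tariff for order years
  // 1992..=2007 under the Serbia-Tunis trade agreement.
  // See "tariff-EU-92-0578", section "exemptions", for the full text.
  fn import_tariff(product: &Product, order_year: u32) -> f64 {
      if product.is_cigarettes && (1992..=2007).contains(&order_year) {
          return 0.0; // exemption applies
      }
      product.base_tariff
  }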


And any attempt whatsoever is some improvement over doing nothing and wishing luck to the next guy.


Thing is, good documentation has to be part of the company's process. eg, a QA engineer would have to be responsible for checking the documentation and certifying it. Costs money and time.

You can't expect developers, already working 60 hour weeks to meet impossible deadlines, to spend another 15 hours altruistically documenting their code.


Any documentation at all > no documentation, 99 times out of 100. And requiring your people to work 60 hours/week is symptomatic of larger problems.


How about old, out-of-date documentation that is actively misleading? Because that’s mostly what I run into, and it’s decidedly worse than no documentation.

Give me readable code over crappy documentation any day. In an ideal world the docs would be correct all of the time, apparently I don’t live in that world, and I’ve grown tired of listening to those who claim we just need to try harder.


Every line of documentation is a line of code and is a liability, as it will rot if not maintained. That’s why you should be writing self-documenting code as much as possible, which obviates the need for documentation. But unlike code, stale/wrong docs will not break tests.

Spending 15 hours documenting the code is something no leader should be asking engineering to do. You should not need to do it. Go back and write better code: one that’s more clear at a glance, easily readable, uses small functions written at a comparable level of abstraction, and uses clear, semantically meaningful names.

Before you write a line of documentation, you should ask yourself whether the weird thing you were about to document can be expressed directly in the name of the method or the variable instead. Only once you have exhausted all the options for expressing the concept in code, then, only then, are you allowed to add the line of documentation regarding it.
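For instance, the kind of move I mean (a made-up example):

  // Before: the "why" lives only in a comment that can rot.
  //   let threshold = 86_400; // sessions expire after one day
  //
  // After: the name carries the concept, and call sites read naturally.
  const SESSION_EXPIRY_SECONDS: u64 = 24 * 60 * 60;

  fn is_expired(age_seconds: u64) -> bool {
      age_seconds > SESSION_EXPIRY_SECONDS
  }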


> Only once you have exhausted all the options for expressing the concept in code, then, only then, are you allowed to add the line of the documentation regarding it.

But that's what people are talking about when talking about comments. The assumption is that the code is organized and named well already.

The real world of complexity is way beyond the expressiveness of code, unless you want function names like:

prorateTheCurrentDaysPartialFactoryReceiptsToYesterdaysStateOfSalesOrderAllocationsInTheSamePrioritySequenceThatDrivesFinancialReportingOfOwnedInventoryByBusinessUnit()

The code that performs this function is relatively simple, but the multiple concepts involved in the WHY and HOW are much less obvious.


Or you know, work the devs 40 hour weeks and make sure documentation is valued. Everything costs one way or another, it's all trade-off turtles all the way down.


Don't let perfect be the enemy of good.

"We don't write any documentation because we can't afford a dedicated QA process to certify it" <- that's dumb.


Yeah: "what if this code becomes tech debt later" applies to everything, not just comments. It's a tradeoff.

The best thing you can do to avoid creating debt for later maintainers is to write code that's easy to delete, and adding comments helps with that.


An outdated comment is still a datapoint! Including if the comment was wrong when it was first written!

We live in a world with version history, repositories with change requests, communications… code comments are a part of that ecosystem.

A comment that is outright incorrect at inception is still valuable even if it is at least an attempt by the writer to describe their internal understanding of things.


This. I have argued with plenty of developers on why comments are useful, and the counter arguments are always the same.

I believe it boils down to a lack of foresight. At some point in time, someone is going to revisit your code, and even just a small `// Sorry this is awful, we have to X but this was difficult because of Y` will go a long way.

While I (try to) have very fluid opinions in all aspects of programming, the usefulness of comments is not something I (think!) I'll ever budge on. :)


> // Sorry this is awful, we have to X but this was difficult because of Y

You don’t know how many times I’ve seen this with a cute little GitLens inline message of “Brian Smith, 10 years ago”. If Brian couldn’t figure it out 10 years ago, I’m not likely going to attempt it either, especially if it has been working for 10 years.


But knowing what Brian was considering at the time is useful, both for avoiding redoing that and for realising that some constraints may have been lifted.


We should call them code clues


What if you don't know that the comment is wrong?


IMO the only thing you can assume is that the person who wrote the comment wasn't actively trying to deceive you. You should treat all documentation, comments, function names, commit messages etc with a healthy dose of scepticism because no one truly has a strong grip on reality.


Right, unlike code (which does what it does, even if that isn't what the writer meant) there's no real feedback loop for comments. Still worth internalizing the info based on that IMO.

"This does X" as a comment when it in fact does Y in condition Z means that the probability you are looking at a bug goes up a bit! Without the comment you might not be able to identify that Y is not intentional.

Maybe Y is intentional! In which case the comment that "this is intentional" is helpful. Perhaps the intentionality is also incorrect, and that's yet another data point!

Fairly rare for there to be negative value in comments.


It just occurred to me that perhaps this is where AI might prove useful. Functions could have some kind of annotation that triggers AI to analyze the function and explain it plain language when you do something like hover over the function name in the IDE, or, you can have a prompt where you can interact with that piece of code and ask it questions. Obviously this would mean developer-written comments would be less likely to make it into the commit history, but it might be better than nothing, especially in older codebases where the original developer(s) are long gone. Maybe this already exists, but I’m too lazy to research that right now.


But then could you trust it not to hallucinate functionality that doesn't exist? Seems as risky as out-of-date comments, if not more

What I'd really like is an AI linter that notices if you've changed some functionality referenced in a comment without updating that comment. Then, the worst-case scenario is that it doesn't notice, and we're back where we started.


Comments that explain the intent, rather than implementation, are the more useful kind. And when intent doesn't match the actual code, that's a good hint - it might be why the code doesn't work.
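For example (the rate-limiter detail is invented, just to show the flavour):

  use std::{thread, time::Duration};

  // Intent, not implementation: the upstream rate limiter resets every
  // 100 ms, so retrying sooner than that is guaranteed to fail again.
  // (A comment that only said "sleep 100 ms" would restate the code; if
  // the code stops matching this comment, that itself is a bug hint.)
  fn back_off_before_retry() {
      thread::sleep(Duration::from_millis(100));
  }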


If a developer can’t write intelligible comments or straightforward code, then I’d argue they should find another job.


I mean it's easy to say silly things like this, but in reality most developers suck in one way or another.

In addition companies don't seem to give a shit about straightforward code, they want LOC per day and the cheapest price possible which leads to tons of crap code.


Each person has their own strengths, but a worthwhile team member should be able to meet minimum requirements of readability and comments. This can be enforced through team agreements and peer review.

Your second point is really the crux of business in a lot of ways. The balance of quality versus quantity. Cost versus value. Long-term versus short term gains. I’m sure there are situations where ruthlessly prioritizing short term profit through low cost code is indeed the optimal solution. For those of us who love to craft high-quality code, the trick is finding the companies where it is understood and agreed that long-term value from high-quality code is worth the upfront investment and, more importantly, where they have the cash to make that investment.


>I’m sure there are situations where ruthlessly prioritizing short term profit through low cost code is indeed the optimal solution

This is mostly how large publicly traded corps work; unless they are run by programmers who want great applications, or are required by law, they tend to write a lot of crap.


>In addition companies don't seem to give a shit about straightforward code, they want LOC per day and the cheapest price possible which leads to tons of crap code.

Companies don't care about LOC, they care about solving problems. 30 LOC or 30k LOC doesn't matter much MOST of the time. They're just after a solution that puts the problem to rest.


If a delivery company has four different definitions of a customer’s first order, and the resulting code has contents that are hard to parse - does the blame lie with the developer, or the requirements?


If the developer had time to do it, with him. Otherwise with the company

I'm sure there's some abysmal shit that's extremely hard to properly abstract. Usually the dev just sucks or they didn't have time to make the code not suck


Business requirements deviate from code almost immediately. Serving several clients with customisation adds even more strain on the process. Eventually you want to map paragraphs of business req to code which is not a 1:1 mapping.

Aging codebase and the ongoing operations make it even harder to maintain consistently. eventually people surrender.


Then in 3 months someone comes along and changes the code slightly in a way that makes the comment obsolete, but doesn’t update the comment. Making it all worse, not better.

Issue trackers are much better because then in git you can find tickets attached to the change.

No ticket explaining why - no code change.

Why not in the repo? Because business people write tickets, not devs. Then tickets are passed to QA, who do read the code but also need that information.


Why did the reviewer approve the change if the developer didn’t update the comment?

It sounds like people are failing at their jobs.


Oh that is one of my pet peeves.

"If only people would do their jobs properly".

So we just fire all the employees and hire better ones only because someone did not pay attention to the comment.

Of course it is an exaggeration - but in the same vein, people who think "others are failing at their jobs" should pick up and do all the work there is to be done and see how long they go until they miss something or make a mistake.

Solution should be systematic to prevent people from failing and not expecting "someone doing their job properly".

Not having comments as something that needs a review reduces workload on everyone involved.

Besides, interfaces for PRs clearly mark what changed - they don't point out what hasn't been changed. So naturally people review what has changed. You still get the context of course and can see a couple of lines above and below... But still I blame the tool, not the people.


Requirements should be in the JIRA. JIRA number should be in the commit message.

You do git blame and you see why each line is what it is.

Comments are nice too, but they tend to lie the older they are. Git blame never lies.


Code tends to be reused. When that happens the jira ticket is not likely to travel alongside the code. All 'older' jira tickets are useless broken links. All you have in practice is the jira name. It usually happens with 'internal documentation' links as well.

Git blame often lies when a big merge was squashed. I mostly had these in Perforce so I might be wrong. Also when code travels between source control servers and different source control software, it loses information as well.

I would say in my practical gamedev experience the best comments I saw are "TODO implement me" and (unit) test code that still runs. The first clearly states that you have reached outside of what was planned, and the second lets you inspect what the code was meant to do.


One of my favorite conventions is ‘TODO(username): some comment’. This lets attribution survive merges and commits and lets you search for all of someone’s comments using a grep.


I tend to do:

  // TODO: <the name of some ticket>: <what needs to happen here>
e.g.

  // TODO: IOS-42: Vogon construction fleet will need names to be added to this poetry reading room struct
I've not felt my name is all that important for a TODO, as the ticket itself may be taken up by someone else… AFAICT they never have been, but they could have been.


Jira entries get wiped arbitrarily. Git blame may not lie, but it doesn't survive larger organizational "refactoring" around team or company mergers. Or refactoring code out into separate project/library. Hell, often enough it doesn't survive commits that rename bunch of files and move other stuff around.


Comments are decent but flawed. Being a type proponent I think the best strategy is lifting business requirements into the type system, encoding the invariants in a way that the compiler can check.
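A small sketch of the kind of thing I mean, in Rust (the payout example is invented):

  // Requirement: a payout must be approved before it can be sent.
  // Encoded as types, the wrong ordering simply doesn't compile.
  struct DraftPayout { amount_cents: u64 }
  struct ApprovedPayout { amount_cents: u64 }

  fn approve(p: DraftPayout) -> ApprovedPayout {
      ApprovedPayout { amount_cents: p.amount_cents }
  }

  // Only an ApprovedPayout can ever reach this function.
  fn send(p: ApprovedPayout) {
      println!("sending {} cents", p.amount_cents);
  }

  fn main() {
      let draft = DraftPayout { amount_cents: 1250 };
      // send(draft);       // compile error: expected ApprovedPayout
      send(approve(draft)); // fine
  }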


Comments should describe what the type system can't. Context, pitfalls, workarounds for bugs in other code, etc.


Thank god we’re held to such low standards. Every time I’ve worked in a field like pharmaceuticals or manufacturing, the documentation burden felt overwhelming by comparison and a shrug six months later would never fly.


We are not engineers. We are craftsmen; instead of working with wood, we work with code. What most customers want is an equivalent of "I need a chair, it should look roughly like this."

If they want blueprints and documentation (e.g. maximum possible load and other limits), we can supply (and do supply, e.g. in pharma or medicine), but it will cost them quite a lot more. By the order of magnitude. Most customers prefer cobbled up solution that is cheap and works. That's on them.

Edit: It is called waterfall. There is nothing inherently wrong with it, except customers didn't like the time it took to implement a change. And they want changes all the time.


> We are not engineers. We are craftsmen

Same difference. Both appellations invoke some sort of idealized professional standards and the conversation is about failing these standards not upholding them. We're clearly very short of deserving a title that carries any sort of professional pride in it. We are making a huge mess of the world building systems that hijack attention for profit and generate numerous opportunities for bad agents in the form of security shortfalls or opportunities to exploit people using machines and code.

If we had any sort of pride of craft or professional standards we wouldn't be pumping out the bug ridden mess that software's become and trying to figure out why in this conversation.


That is quite a cynical take. A lot of us take pride in our work and actively avoid companies that produce software that is detrimental to society.


It is cynical but it is also a generalization better supported by the evidence than "we're craftsmen" or "we're engineers".

If you can say "I'm a craftsman" or "I'm an engineer" all the power to you. Sadly I don't think we can say that in the collective form.


> If you can say "I'm a craftsman" or "I'm an engineer" all the power to you. Sadly I don't think we can say that in the collective form.

My cynicism of the software "profession" is entirely a function of experience, and these titles are the (very rare) exception.

The norm is low-quality, low complexity disposable code.


Hmm, thinking back, I think most companies I worked at (from the small to the very large tech companies) had on average pretty good code and automated tests, pretty good processes, pretty good cultures and pretty good architectures. Some were very weak in one aspect, but made up for it in others. But maybe I got lucky?


> Both appellations invoke some sort of idealized professional standards

The key point of the comment was that engineers do have standards, both from professional bodies and often legislative ones. Craftsmen do not have such standards (most of them, at least where I am from). Joiners definitely don't.

Edit: I would also disagree with "pumping out bug ridden mess that software's become."

We are miles ahead in security of any other industry. Physical locks have been broken for decades and nobody cares. Windows are breakable by a rock or a hammer and nobody cares.

In terms of bugs, that is extraordinarily low as well. In pretty much any other industry, it would be considered a user error, e.g. do not put mud as a detergent into the washing machine.

Whole process is getting better each year. Version control wasn't common in 2000s (I think Linux didn't use version control until 2002). CI/CD. Security analyzers. Memory managed/safe languages. Automatic testing. Refactoring tools.

We somehow make hundreds of millions of lines of code work together. I seriously doubt there is any industry that can do that at our price point.


> We are miles ahead in security of any other industry. Physical locks have been broken for decades and nobody cares. Windows are breakable by a rock or a hammer and nobody cares.

That is not such a great analogy, in my opinion. If burglars could remotely break into many houses in parallel while being mostly non-trackable and staying in the safety of their own home, things would look differently on the doors and windows front.


The reason why car keys are using chips is because physical safety sucks so much in comparison with digital.

The fact is we are better at it because of the failure of the state to establish a safe environment. Generally, protection and a safe environment are among the reasons for paying taxes.


> The reason why car keys are using chips is because physical safety sucks so much in comparison with digital.

Not the reason. There is no safe lock, chip or not. You can only make it more inconvenient than the next car to break into.

> The fact is we are better at it because of failure of state to establish the safe environment. Generally protection and safe environment is one of reason for paying taxes.

Exactly backwards. The only real safety is being in a hi-sec zone protected by social convention and State retribution. The best existing lock in a place where bad actors have latitude won't protect you, and in a safe space you barely need locks at all.


OTOH, the level of documentation you get for free from source control would be a godsend in other contexts: the majority of the documentation you see in other processes is just to get an idea of what changed when and why.


there is a difference between building a dashboard for internal systems and tech that, if it fails, can kill people


Most software work in pharma and manufacturing is still CRUD, they just have cultures of rigorous documentation that permeates the industry even when it's low value. Documenting every little change made sense when I was programming the robotics for a genetic diagnostics pipeline, not so much when I had to write a one pager justifying a one line fix to the parser for the configuration format or updating some LIMS dependency to fix a vulnerability in an internal tool that's not even open to the internet.


Well, a hand watch or a chair cannot kill people, but the manufacturing documentation for them will be very precise.

Software development is not engineering because it is still a relatively young and immature field. There is a joke where a mathematician, a physicist and an engineer are given a little red rubber ball and asked to find its volume. The mathematician measures the diameter and computes, the physicist immerses the ball into water and sees how much was displaced, and the engineer looks it up in his "Little red rubber balls" reference.

Software development does not yet have anything that may even potentially grow into such a reference. If we decide to write it we would not even know where to start. We have mathematicians who write computer science papers; or physicists who test programs; standup comedians, philosophers, everyone. But not engineers.


The difference is that code is the documentation and design.

That is the problem: people don't understand that point.

The runtime and running application are the chair. The code is the design for how to make the "chair" run on a computer.

I say in software development we are years ahead when it comes to handling complexity of documentation with GIT and CI/CD practices, code reviews and QA coverage with unit testing of the designs and general testing.

So I do not agree that software development is immature field. There are immature projects and companies cut corners much more than on physical products because it is much easier to fix software later.

But in terms of practices we are way ahead.


Isn’t this similar to saying the valves and vessels of a chemical processing system is the design and documentation of the overall process?

I know that it’s frequently reposted but Peter Naur’s Programming as Theory Building is always worth a reread.

The code doesn’t tell us why decisions were made, what constraints were considered or what things were ruled out


The word code comes from the Latin caudex, which seems to mean "to hack a tree". Are we then not mere lumberjacks, with the beards and beer and all :)))


> Oh, and in 6 months the unintuitive and confusing concept needs to be completely changed into - surprise, surprise - a completely different but equally unintuitive and confusing concept.

But you have to keep the old way of working exactly the same, and the data can't change, but also needs to work in the new version as well. Actually show someone there's two modes, and offer to migrate their data to version 2? No way - that's confusing! Show different UI in different areas with the same data that behaves differently based on ... undisclosed-to-the-user criteria. That will be far less confusing.


As a user 'intuitive' UIs that hide a bunch of undisclosed but relevant complexity send me into a frothing rage.


In many problem spaces, software developers are only happy with interfaces made for software developers. This article diving into the layers of complex logic we can reason about at once perfectly demonstrates why. Developers ‘get’ that complexity, because it’s our job, and think about GUIs as thin convenience wrappers for the program underneath. To most users, the GUI is the software, and they consider applications like appliances for solving specific problems. You aren’t using the refrigerator, you’re getting food. You’re cooking, not using the stove. The fewer things they have to do or think about to solve their problem to their satisfaction, the better. They don’t give a flying fuck about how software does something, probably wouldn’t bother figuring out how to adjust it if they could, and the longer it takes them to figure out how to apply their existing mental models and UI idioms to the screen they’re looking at, the more frustrated they get. Software developers know what’s going on behind the scenes, so seeing all of the controls and adjustments and statuses and data helps developers orient themselves and figure out what they’re doing. Seeing all that stuff is often a huge hindrance to users that just have a problem they need to solve, and who have a much more limited set of mental models and usage idioms to draw on when figuring out which of those buttons to press and parameters to adjust. That’s the primary reason FOSS has so few non-technical users.

The problem comes in when people that aren’t UI designers want to make something “look designed”, so they start ripping stuff out and moving it around without understanding how it affects different types of users. I don’t hear too many developers complain about the interface for iMessage, for example, despite it having a fraction of the controls visible at any given time, because it effectively solves their problem, and does so more easily than with a visible toggle for read receipts, SMS/iMessages, text size, etc etc etc. It doesn’t merely look designed, it’s designed for optimal usability.

Developers often see an interface that doesn’t work well for developers usage style, assume that means it doesn’t work well, and then complain about it among other developers creating an echo chamber. Developers being frustrated with an interface is an important data point that shouldn’t be ignored, but our perspectives and preferences aren’t nearly as generalizable some might think.


I'm not particularly bothered by non-developer UI. I'm bothered by the incessant application of mobile UI idioms to desktop programs (remember when all windows programs looked somewhat similar?), by UI churn with no purpose, by software that puts functionality five clicks deep for no reason other than to keep the ui 'minimal', by the use of unclear icons when there's room for text (worse, when it's one of the bare handful of things with a universally-understood icon and they decided to invent their own), by UIs that just plain don't present important information for fear of making things 'busy'. There's a lot to get mad about when it comes to modern UIs without needing to approach it from a software developer usage style perspective.


You're making a lot of assumptions about who's doing what, what problems they're trying to solve by doing it, and why. The discipline of UI design is figuring out how people can solve their problems easily and effectively. If you have advanced users that need to make five mouse clicks to perform an essential function, that's a bad design and the chance of that being a UI design decision is just about zero. Same thing with icons. UI design, fundamentally, is a medium of communication: do you think it's more likely a UI designer-- a professional and likely educated interactivity communicator-- chose those icons, or a developer or project manager grabbing a sexy looking UI mockup on dribble and trying to smash their use case into it?

Minimalism isn't a goal-- it's a tool to make a better interface and can easily be overused. The people that think minimalism is a goal and will chop out essential features to make something "look designed" are almost always developers. Same thing with unclear icons. As someone with a design degree that's done UI design but worked as a back-end developer for a decade before that, and worked as a UNIX admin off and on for a decade before that, I am very familiar with the technical perspective on design and it's various echo-chamber-reinforced follies.

It's not like all UI designers are incredibly qualified or don't underestimate the importance of some particular function within some subset of users, and some people that hire designers don't realize that a graphic designer isn't a UI designer and shouldn't be expected to work as one. But 700 times out of 1000, it's that a dev said "this is too annoying to implement" or some project manager dropped it from the timeline. Maybe 250 of those remaining times, the project manager says "we don't need designers for this next set of features, right? Dev can just make it look like the other parts of the project?"

Developers read an Edward Tufte book, think they're experts, and come up with all sorts of folk explanations about what's happening with a design and why people are doing it, then talk about it in venues like this with a million other developers agreeing with them. That does a whole lot more damage to UIs in the wild than bad design decisions made by designers.


You seem to think I'm attacking UI designers. I'm not. I think software would be a lot better with professional UI designers designing UIs.

edit: I am making a lot of assumptions. I'm assuming that most UIs aren't really designed, or are 'designed' from above with directions that are primarily concerned about aesthetics.


+1 to all this. And when did it become cool to have icons that provide no feedback they've been clicked, combined with no loading state? I'm always clicking stuff twice now because I'm not sure I even clicked it the first time.


I think a lot of this is bike shedding. Changing the interface design is easy. Understanding usability and building usable systems is hard.


> That’s the primary reason FOSS has so few non-technical users.

Yeah, citation needed. If your argument is that 'non-technical users' (whatever that is - being technical is not restricted to understanding computers and software deeply) don't use software that exposes a lot of data on its internals, as exemplified by FOSS having few 'non-technical users' (meaning people who are not software developers), then it is just false. There are entire fields where FOSS software is huge. GIS comes to mind.


Normally in this rant I specifically note that non-software technical people are still technical. For genuinely non-technical software, what are the most popular end-user facing FOSS-developed applications? Firefox, Signal, Blender, Inkscape, Krita maybe… most of those are backed by foundations that pay designers and in Mozilla’s case, actually do a ton of open usability research. I don’t believe Inkscape does, but they do put a ton of effort into thinking about things from the user workflow perspective and definitely do not present all of the functionality to the user all at once. Blender, at first, just made you memorize a shitload of shortcuts, but they’ve done a ton of work figuring out what users need to see in which tasks in different workflows and have a ton of different purpose-built views. For decades, Gimp treated design, workflow and UI changes like any other feature and they ended up with a cobbled-together, ham-fisted interface used almost exclusively by developers. You’ll have a hard time finding a professional photographer that hasn’t tried Gimp and an even harder time finding one that still uses it, because of the confusing, unfocused interface. When Mastodon stood a real chance of being what Bluesky is becoming, I was jumping up and down flailing my arms trying to get people to work on polishing the user flow and figure out how to communicate what they needed to know concisely. Dismissal dismissal dismissal. “I taught my grandmother how federation works! They just need to read the documentation! Once they start using it they’ll figure it out!” Well, they started using it, didn’t have that gifted grandmother-teaching developer to explain it to them, and they almost all left immediately afterwards.

Just like human factors engineering, UI design is a unique discipline that many in the engineering field think they can intuit their way through. They’re wrong and if you look beyond technical people, it’s completely obvious.


Worth noting that Gimp just made a separate UI design repo and seem to be doing a great job at confronting this systemic problem in the project.


I'm trying to learn acceptance: how not to get so angry at despicable UIs.

Although I admit I'm kinda failing. My minor successes have been by avoiding software: e.g. giving up programming (broken tools and broken targets were a major frustration) and getting rid of Windows.


Having given up programming, what do you do now?


IMO the fact that code tends to become hard over time in the real world, is even more reason to lower cognitive load. Because cognitive load is related to complexity. Things like inheritance make it far too easy to end up with spaghetti. So if it's not providing significant benefit, god damn don't do it in the first place (like the article mentions).


That depends on who thinks it's going to be a significant benefit - far far too many times I've had non-technical product managers yelling about some patch or feature or whatever with a "just get it done" attitude. Couple that with some junior engineering manager unwilling to push back, with an equally junior dev team and you'll end up with the nasty spaghetti code that only grows.


Sounds like a bunch of excellent excuses why code is not typically well factored. But that all just seems to make it more evident that the ideal format should be more well-factored.


>It's quite easy to imagine a well factored codebase where all things are neatly separated.

If one is always implementing new code bases that they keep well factored, they should count their blessings. I think being informed about cognitive load in code bases is still very important for all the times we aren't so blessed. I've inherited applications that use global scope and it is a nightmare to reason though. Where possible I improve it and reduce global scope, but that is not always an option and is only possible after I have reasoned enough about the global scope to feel I can isolate it. As such, letting others know of the costs is helpful to both reduce it from happening and to convince stakeholders of the importance of fixing it after it has happened and accounting for the extra costs it causes until it is fixed.

>The messy stuff is where the real world concepts need to be transformed into code.

I also agree this can be a messy place, and on a new project, it is messy even when the code is clean because there is effectively a business logic/process code base you are inheriting and turning into an application. I think many of the lessons carry over well as I have seen an issue with global scope in business processes that cause many of the same issues as in code bases. When very different business processes end up converging into one before splitting again, there is often extra cognitive load created in trying to combine them. A single instance really isn't bad, much like how a single global variable isn't bad, but this is an anti-pattern that is used over and over again.

One helpful tool is working one's way up to the point of having enough political power and having earned enough respect for one's designs that suggestions to refactor business processes are taken into serious consideration (one also has to have enough business acumen to know when such a suggestion is reasonable).

>the original author doesn't work here anymore so no one's here to explain the original code's intent.

I fight for comments that tell me why a certain decision is made in the code. The code tells me what it is doing, and domain knowledge will tell most of why it is doing the things expected, but anytime the code deviates from doing what one would normally expect to be done in the domain, telling me why it deviated from expected behavior is very important for when someone is back here reading it 5+ years later when no one is left from the original project. Some will suggest putting it in documentation, but I find that the only documentation with any chance of being maintained or even kept is the documentation built into the code.


The "why" is the hardest part. You are writing to a future version of most probably a different person with a different background. Writing all is as wrong as writing nothing. You have to anticipate the questions of the future. That takes experience and having been in different shoes, "on the receiving side" of such a comment. Typically developers brag what they did, not why, especially the ones who think they are good...


> Where just the concepts need to be whiteboarded and explained because they're unintuitive and confusing.

they're intuitive to somebody - just not the software engineer. This simply means there's some domain expertise which isn't available to the engineer.


Not necessarily. There are a lot of domains where you're digitizing decades of cobbled together non-computer systems, such as law, administration, or accounting. There's a very good chance that no single human understands those systems either, and that trying to model them will inevitably end up with obscure code that no one will ever understand either. Especially as legislation and accounting practices accrete in the future, with special cases for every single decision.


Plus one to everything said. It's the everyday life of a "maintainer": picking the next battle, picking the best way to avoid sinking deeper, and defending the story that exactly "this" is the next refactoring project. All that while balancing the different factors you mention in order to actually believe oneself, because there are countless paths..


Oh, and there's massive use of aspect-oriented programming, the least local paradigm ever!


I have never actually seen aspect-oriented programming used in the wild. Out of curiosity, in what context are you seeing AOP used?


We use it to automatically instrument code for tracing. Stuff like this is IMO the only acceptable use to reduce boiler-plate but quickly becomes terrible if you don't pay attention.


Also good for having default activities performed on an object or subsystem. For instance, by default, always having an object perform security checks to make sure it has permission to perform the tasks it should (I have seen this, and it sounds like a good idea at least). And also, having some basic logging performed to show when you've entered and left function calls. It's easy to forget to add these to a function, especially in a large codebase with lots of developers.


This puts things really well. I’ll add into it that between the first white boarding session and the first working MVP there’ll be plenty of stakeholders who change their mind, find new info, or ask for updates that may break the original plan


It can be done. Sometimes.

I am so proud and happy when I can make a seemingly complicated change quickly, because the architecture was well designed and everything neatly separated.

Most of the time though, it is exactly like you described. Or Randall's good code comic:

https://xkcd.com/844/

Almost too painful to be funny, when you know the pain is avoidable in theory.

Still, it should not be an excuse to be lazy and just write bad code by default. Developing the habit of making everything as clean, structured and clear as possible always pays off. Especially if that code, that was supposed to be a quick and dirty throwaway experiment, somehow ended up being used and 2 years later you suddenly need to debug it. (I just experienced that joy)


Nothing about computers is intuitive. Not even using a mouse.

A late-breaking change is a business advantage; learn how to roll with it.


In my experience, the more convoluted code is more likely to have performance issues.


I mean really, nobody wants an app that is slow, hard to refactor, with confusing business logic, etc. Everyone wants good properties.

So then you get into what you’re good at. Maybe you’re good at modeling business logic (even confusing ones!). Maybe you’re good at writing code that is easy to refactor.

Maybe you’re good at getting stuff right the first time. Maybe you’re good at quickly fixing issues.

You can lean into what you’re good at to get the most bang for your buck. But you probably still have some sort of minimum standards for the whole thing. Just gotta decide what that looks like.


Some people are proud of making complex code. And too many people admire those who write complex code.


> you also need to write it in a convoluted way because, for various annoying reasons, that's what performs best on the computer.

That's nothing to do with hardware. The various annoying reasons are not set in stone or laws of physics. They are merely the path dependency of decades of prioritizing shipping soon because money.


> If someone feels compelled to read every function either the functions are poor abstractions or the reader has trust issues, which may be warranted.

I joined a company with great code and architecture for 3 months last year. They deal with remittances and payments.

Their architecture leads are very clued up, and I observed that they spent a lot of quality time figuring out their architecture and improvements, continuously. They'd do a lot of refactors for all the various teams, and the cadence of feature development and release was quite impressive.

In that period though, I and another long-standing colleague made a few errors that cost the company a lot of money, like an automated system duplicating payments to users for a few hours until we noticed it.

Part of their architectural decision was to use small functions to encapsulate logic, and great care and code review was put into naming functions appropriately (though they were comment averse).

The mistakes we committed, were because we trusted that those functions did what they said they did correctly. After all, they've also been unit tested, and there's also integration tests.

If it weren't for the fortitude of the project manager (great guy hey) in firmly believing in collective responsibility if there's no malice, I'd probably have been fired after a few weeks (I left for a higher offer elsewhere).

---

So the part about trust issues resonates well with me. As a team we made the decision that we shouldn't always trust existing code, and the weeks thereafter had much higher cognitive load.


That sounds like a very difficult situation. Would you be willing to elaborate on what kinds of bugs lay in the pre-existing functions? Was some sort of operation that was supposed to be idempotent (“if you call it with these unique parameters over and over, it will be the same as if you only called it once”) not so? I am trying to imagine what went wrong here. A tough situation, must have been quite painful. How serious were the consequences? If you don’t feel comfortable answering that is okay.


I can't remember the exact detail, but one instance was a function checking whether a user should be paid based on some conditions. It checked the db, and I think because the codebase and db move fast, there was a new enum added a few months prior which was triggered by our transaction type.

So that helper function didn't account for the new enum, and we ended up sending >2 payments to users, in some cases I think over 10 to one user.

The issue was brought to customer support's attention, else we might have only noticed it at the end of the week, which I think would have led to severe consequences.

The consequences never reached us because our PM dealt with them. I suppose in all the financial loss instances, the business absorbed the losses.


> So that helper function didn't account for the new enum

This is where Scala/Rust's enforcement of having to handle all arms of a match clause helps catch such issues - if you are matching against the enum, you won't even be able to compile if you don't handle all the arms.
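Roughly like this (the enum and helper are invented, not the actual codebase from the story above):

  enum TransactionType {
      Purchase,
      Refund,
      // Adding a new variant here (say, Chargeback) turns the match below
      // into a compile error until the new arm is handled explicitly.
  }

  fn should_pay_user(t: &TransactionType) -> bool {
      match t {
          TransactionType::Purchase => true,
          TransactionType::Refund => false,
      }
  }

  fn main() {
      println!("{}", should_pay_user(&TransactionType::Refund)); // false
  }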


Sounds like the source of truth for the enum members may have been in the database.

(But yes, exhaustiveness checking for sum types is a great feature.)


The only db work I've done in rust required a recompile if the db schema changed, or even the specific queries your program used, because the rust types got generated from the schema. So in those cases the db change would have driven a rust type change and rust would have verified exhaustive handling.


Db changes are generally at runtime, how would you recompile rust code during the save of the data to the db? How do you rollback the change if a compile fails? How do you add the necessary code to handle new cases of the enum but not have it present in the db? This is amazingly interesting to me, would love to know more.


Maybe a code gen layer that generates rust types from a db schema. I don’t know rust but have seen those in other languages. I could see a DB enum type corresponding to a language specific enum type and then the language rules applying.

I do think this is a level of indirection myself; if the generated code was perfect and always in sync, that would be one thing, but by definition it is not the case.


Function names aren't wholly distinct from comments. They suffer from the same problems as comments - they can go stale and no longer reflect the code they're naming.


I think the argument against comments is that while function names are a necessary form of communicating intent of the code, comments aren’t. The more forms there are the more work there is to update on each change in the code. Comments mean more to update and hence more to fail to update. They also generally can’t be detected for staleness as well as functions, although that is changing now with better ai, not only compilers etc.


Functions generally need to be documented, especially if there are any gotchas not obvious from the function signature. And one should always read the documentation. Good names are for discovery and recollection, and for the call-site code to be more intelligible, but they don’t replace having a specification of the function’s interface contract, and client code properly taking it into account.
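Something like this, say (a made-up example of a gotcha the signature can't express):

  /// Returns the user's display name.
  ///
  /// Note: names are cached for up to an hour, so a recent rename may not
  /// be reflected immediately. Call sites that need the current legal name
  /// should not use this.
  fn display_name(user_id: u64) -> String {
      format!("user-{user_id}") // placeholder body for the sketch
  }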


>The mistakes we committed, were because we trusted that those functions did what they said they did correctly. After all, they've also been unit tested, and there's also integration tests.

As it is stated, I don't see where it is your mistake. You should be able to trust things do what they say, and there should be integration testing that happens which adds the appropriate amount of distrust and verification. Even with adequate unit testing, you normally inject the dependencies so it wouldn't be caught.

This seems an issue caused by two problems, inadequate integration testing and bugs in the original function, neither of which are your fault.

Building a sixth sense of when to distrust certain code is something you see from more experienced developers at a company, but you were new so there is no reason to expect you to have it (and the system for making code changes shouldn't depend upon such intuition anyways).


> This seems an issue caused by two problems, inadequate integration testing and bugs in the original function, neither of which are your fault.

I believe the problem may be the culture. Business logic that handles sensitive things where bugs can cost a lot of money is one place where hardcore DRY and factoring everything into small functions is not such a great idea. Yes, it may be a big function, and there is an upfront overhead of having to understand it all to make a change, and there is some duplication of code, but once you understand the function you can reason locally and such bugs would be less likely.


> I've been thinking about the notion of "reasoning locally" recently. Enabling local reasoning is the only way to scale software development past some number of lines or complexity. When reasoning locally, one only needs to understand a small subset, hundreds of lines, to safely make changes in programs comprising millions.

That was supposedly the main trait of object-oriented programming. Personally that was how it was taught to me: the whole point of encapsulation and information hiding is to ensure developers can "reason locally", and thus be able to develop more complex projects by containing complexity to specific units of execution.

Half of the SOLID principles also push for that. The main benefit of Liskov's substitution principle is to ensure developers don't need to dig into each and every concrete implementation to be able to reason locally about the code.

On top of that, there are a multitude of principles and rules of thumb that also enforce that trait. For example, declaring variables right before they are used the first time. Don't Repeat Yourself to avoid parsing multiple implementations of the same routine. Write Everything Twice to avoid premature abstractions and tightly coupling units of execution that are actually completely independent, etc etc etc.

Heck, even modularity, layered software architectures, and even microservices are used to allow developers to reason locally.

In fact, is there any software engineering principle that isn't pushing for limiting complexity and allowing developers to reason locally?


> In fact, is there any software engineering principle that isn't pushing for limiting complexity and allowing developers to reason locally?

Both DRY and SOLID lead to codebases that can be worse in this respect.

DRY and SRP limit what will be done in a single method or class, meaning both that the logic will eventually be strewn across the codebase and that any changes to it will need to take all of the pieces using the extracted logic into account. Sometimes it makes sense to have something like common services, helper and utility classes, but those can be in direct opposition to local reasoning for any non-trivial logic.

Same for polymorphism and inheritance in general, where you suddenly have to consider a whole class structure (and any logic that might be buried in there) vs the immediate bits of code that you’re working with.

Those might be considered decent enough practices to at least consider, but in practice they will lead to a lot of jumping around the codebase, same for any levels of abstraction (resource/controller, service, mappers, Dto/repository, …) and design patterns.


Yeah I think that, though experienced programmers tend to understand what makes code good, they're often bad at expressing it, so they end up making simplified and misleading "rules" like SRP. Some rules are better than others, but there's no substitute for reading a lot of code and learning to recognize legibility.


> Yeah I think that, though experienced programmers tend to understand what makes code good, they're often bad at expressing it, so they end up making simplified and misleading "rules" like SRP.

I mean, I'm not saying that those approaches are always wholly bad from an organizational standpoint either, just that there are tradeoffs and whatnot.

> Some rules are better than others, but there's no substitute for reading a lot of code and learning to recognize legibility.

This feels very true though!


Encapsulation is the good part of object-oriented programming for precisely this reason, and most serious software development relies heavily on encapsulation. What's bad about OOP is inheritance.

Microservices (in the sense of small services) are interesting because they are good at providing independent failure domains, but add the complexity of network calls to what would otherwise be a simple function call. I think the correct size of service is the largest you can get away with that fits into your available hardware and doesn't compromise on resilience. Within a service, use things like encapsulation.


Inheritance is everyone's favorite whipping boy, but I've still never been in a codebase and felt like the existing inheritance was seriously hindering my ability to reason about it or contribute to it, and I find it productive to use on my own. It makes intuitive sense and aids understanding and modularity/code resuse when used appropriately. Even really deep inheritance hierarchies where reasonable have never bothered me. I've been in the industry for at least 8 years and a volunteer for longer than that, and I'm currently in a role where I'm one of the most trusted "architects" on the team, so I feel like I should "get it" by now if it's really that bad. I understand the arguments against inheritance in the abstract but I simply can't bring myself to agree or even really empathize with them. Honestly, I find the whole anti-inheritance zeitgeist as silly and impotent as the movement to replace pi with tau, it's simply a non-issue that's unlikely to be on your mind if you're actually getting work done IMHO.


The problem of inheritance is that it should be an internal mechanism of code reuse, yet it is made public in a declarative form that implies a single pattern of such reuse. It works more or less but it also regularly runs into limitations imposed by that declarativeness.

For example, assume I want to write emulators for old computer architectures. Clearly there will be lots of places where I will be able to reuse the same code in different virtual CPUs. But can I somehow express all these patterns of reuse with inheritance? Will it be clearer to invent some generic CPU traits and make a specific CPU inherit several such traits? It sounds very unlikely. It will probably be much simpler to just extract common code into subroutines and call them as necessary, without trying to build a hierarchy of classes.

Or let's take search trees, for example. Assume I want to have a library of such trees for research or pedagogic purposes. There are lots of variants: AVL trees, 2-3, 2-3-4, red-black, B-trees and so on. Again there will be places where I can reuse the same code for different trees. But can I really express all this as a neat hierarchy of tree classes?
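To make the CPU case concrete, here is a minimal Rust-flavoured sketch. All the names and the flag semantics are invented for illustration; the point is just that two CPU structs can share an extracted helper with no hierarchy involved.

    // hypothetical: a shared helper instead of a common base class
    fn add_and_set_flags(a: u8, b: u8, flags: &mut u8) -> u8 {
        let (result, carry) = a.overflowing_add(b);
        *flags = (*flags & !0b01) | carry as u8; // bit 0: carry (made-up layout)
        if result == 0 {
            *flags |= 0b10; // bit 1: zero
        }
        result
    }

    struct Mos6502 { a: u8, status: u8 }
    struct Z80 { a: u8, f: u8 }

    impl Mos6502 {
        fn adc(&mut self, operand: u8) {
            self.a = add_and_set_flags(self.a, operand, &mut self.status);
        }
    }

    impl Z80 {
        fn add_a(&mut self, operand: u8) {
            self.a = add_and_set_flags(self.a, operand, &mut self.f);
        }
    }

Neither CPU needs to "be" anything in relation to the other; they just call the same subroutine.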


> The problem of inheritance is that it should be an internal mechanism of code reuse, yet it is made public in a declarative form that implies a single pattern of such reuse.

Not quite. A simplistic take on inheritance suggests reusing implementations provided by a base class, but that's not what inheritance means.

Inheritance sets a reusable interface. That's it. Concrete implementations provided by a base class, by design, are only optional. Take a look at the most basic is-a examples from intro to OO.

Is the point of those examples reusing code, or complying with Liskov's substitution principle?

The rest of your comment builds upon this misconception, and thus doesn't hold.


Polymorphism does not require inheritance. Earlier object-oriented systems tied the two together (and many still do), but only because they were exploring every direction at once. Polymorphism actually becomes clearer without inheritance, and many modern systems introduce it as a separate concept: the interface.

For example, I am sending requests to an HTTP server. There are several authentication methods, but when we look at the request/method interaction they are all similar. So it would be convenient to have a standard interface here, something like 'auth.applyTo(request)'. Yet would it be a good idea to try to make the different 'Auth' methods subclasses of each other?

Or another example I'm currently working on: I have a typical search tree, say AVL, but in my case I need to keep references to cells in the tree because I will access it bottom-up. As the tree changes its geometry the data move between cells, so I need to notify the data about the address change. This is simple: I merely provide a callback and the tree calls it with each new and changed cell address. I can store any object as long as it provides this callback interface. Does this mean I need to make all objects I am going to store in a tree inherit some "TreeNotifiable" trait?

Polymorphism happens when we split a system into two components and plan interaction between them. Internals of a component do not matter, only the surface. Inheritance, on the other hand, is a way to share some common behavior of two components, so here the internals do matter. These are really two different concepts.
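A rough sketch of the HTTP auth case in Rust terms, with hypothetical names (the trait method mirrors the 'auth.applyTo(request)' idea above): one shared surface, independent implementations, no subclassing between them.

    struct Request {
        headers: Vec<(String, String)>,
    }

    // the interface: the only thing the request-sending code needs to know
    trait Auth {
        fn apply_to(&self, req: &mut Request);
    }

    struct BearerToken(String);
    impl Auth for BearerToken {
        fn apply_to(&self, req: &mut Request) {
            req.headers.push(("Authorization".into(), format!("Bearer {}", self.0)));
        }
    }

    struct ApiKey(String);
    impl Auth for ApiKey {
        fn apply_to(&self, req: &mut Request) {
            req.headers.push(("X-Api-Key".into(), self.0.clone()));
        }
    }

Neither implementation needs to know the other exists, and neither is a subtype of the other.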


My example complied perfectly with Liskov's substitution principle. Much better than examples like "a JSON parser is a parser". The system I worked on had perfect semantic subtyping.

Liskov substitution won't save you, and I'm quite tired of people saying it will. The problem of spaghetti structures is fundamental to what makes inheritance distinct from other kinds of polymorphism.

Just say no to inheritance.


> [...] it's simply a non-issue that's unlikely to be on your mind if you're actually getting work done IMHO.

Part of why I get (more) work done is that I don't bother with the near-useless taxonomical exercises that inheritance invites, and I understand that there are ways of writing functions for "all of these things, but no others" that are simpler to understand, maintain and implement.

The number of times you actually need an open set of things (i.e. what you get with inheritance) is so laughably low it's a wonder inheritance ever became a thing. A closed set is far more likely to be what you want, and it is trivially represented as a tagged union. It just so happens that C++ (and Java) have historically had absolutely awful support for tagged unions, so people have made do with inheritance even though it doesn't do the right thing. Some people have then taken this to mean that's what they ought to be using.
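As a sketch of what I mean by a closed set (illustrative names only; Rust enums are one way to spell a tagged union):

    // a closed set as a tagged union; adding a variant breaks every match
    // until it is handled, which is usually exactly what you want
    enum Shape {
        Circle { radius: f64 },
        Rect { width: f64, height: f64 },
    }

    fn area(shape: &Shape) -> f64 {
        match shape {
            Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
            Shape::Rect { width, height } => width * height,
        }
    }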

> I've been in the industry for at least 8 years and a volunteer for longer than that, and I'm currently in a role where I'm one of the most trusted "architects" on the team, so I feel like I should "get it" by now if it's really that bad.

I don't think that's really how it works. There are plenty of people who have tons of work experience but they've got bad ideas and are bad at what they do. You don't automatically just gain wisdom and there are lots of scenarios where you end up reinforcing bad ideas, behavior and habits. It's also very easy to get caught up in a collective of poorly thought out ideas in aggregate: Most of modern C++ is a great example of the kind of thinking that will absolutely drag maintainability, readability and performance down, but most of the ideas can absolutely sound good on their own, especially if you don't consider the type of architecture they'll cause.


The difference between inheritance and composition as tools for code reuse is that, in composition, the interface across which the reused code is accessed is strictly defined and explicit. In inheritance it is weakly defined and implicit; subclasses are tightly coupled to their parents, and the resulting code is not modular.


So you've never worked on a code base with a 3-level+ deep inheritance tree and classes accessing their grandparent's protected member variables and violating every single invariant possible?


> 3-level+ deep inheritance tree and classes accessing their grandparent's protected member variables

Yes, I have. Per MSDN, a protected member is accessible within its class and by derived class instances - that's the point. Works fine in the game I work on.

> violating every single invariant possible

Sure, sometimes, but I see that happen without class inheritance just as often.


If you are reading a deep and wide inheritance hierarchy with overridden methods, you will have to navigate through several files to understand where the overrides occur. Basically, multiply the number of potential implementations by inheritance depth times inheritance width.

You may not be bitten by such an issue in application code, but I've seen it in library code, particularly from Google, AWS, various auth libraries, etc., due to having to interop with multiple APIs or configurations.


I'm glad it's been useful to you!

I can only share my own experience here. I'm thinking of a very specific ~20k LoC part of a large developer infrastructure service. This was really interesting because it was:

* inherently complex: with a number of state manipulation algorithms, ranging from "call this series of external services" to "carefully written mutable DFS variant with rigorous error handling and worst-case bounds analysis".

* quite polymorphic by necessity, with several backends and even more frontends

* (edit: added because it's important) a textbook case of where inheritance should work: not artificial or forced at all, perfect Liskov is-a substitution

* very thick interfaces involved: a number of different options and arguments that weren't possible to simplify, and several calls back and forth between components

* changing quite often as needs changed, at least 3-4 times a week and often much more

* and like a lot of dev infrastructure, absolutely critical: unimaginable to have the rest of engineering function without it

A number of developers contributed to this part of the code, from many different teams and at all experience levels.

This is a perfect storm for code that is going to get messy, unless strict discipline is enforced. I think situations like these are a good stress test for development "paradigms".

With polymorphic inheritance, over time, a spaghetti structure developed. Parent functions started calling child functions, and child functions started calling parent ones, based on whatever was convenient in the moment. Some functions were designed to be overridden and some were not. Any kind of documentation about code contracts would quickly fall out of date. As this got worse, refactoring became basically impossible over time. Every change became harder and harder to make. I tried my best to improve the code, but spent so much time just trying to understand which way the calls were supposed to go.

This experience radicalized me against class-based inheritance. It felt that the easy path, the series of local decisions individual developers made to get their jobs done, led to code that was incredibly difficult to understand -- global deterioration. Each individual parent-to-child and child-to-parent call made sense in the moment, but the cumulative effect was a maintenance nightmare.

One of the reasons I like Rust is that trait/typeclass-based polymorphism makes this much less of a problem. The contracts between components are quite clear since they're mediated by traits. Rather than relying on inheritance for polymorphism, you write code that's generic over a trait. You cannot easily make upcalls from the trait impl to the parent -- you must go through an API designed for this (say, a context argument provided to you). Some changes that are easy to do with an inheritance model become harder with traits, but that's fine -- code evolving towards a series of messy interleaved callbacks is bad, and making you do a refactor now is better in the long run. It is possible to write spaghetti code if you push really hard (mixing required and provided methods) but the easy path is to refactor the code.
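To sketch the shape of it (hypothetical names, not the actual system): the generic driver owns the control flow, and the only channel back to it is the context it hands you.

    // hypothetical sketch: the impl can only reach the driver through `ctx`
    struct Ctx {
        log: Vec<String>,
    }

    struct Change {
        description: String,
    }

    trait Backend {
        fn apply(&mut self, change: &Change, ctx: &mut Ctx) -> Result<(), String>;
    }

    // the driver is generic over the trait; there is no parent class to call up into
    fn run_all<B: Backend>(backend: &mut B, changes: &[Change]) -> Result<(), String> {
        let mut ctx = Ctx { log: Vec::new() };
        for change in changes {
            ctx.log.push(format!("applying: {}", change.description));
            backend.apply(change, &mut ctx)?;
        }
        Ok(())
    }

Whatever the backend needs from the driver has to be added to `Ctx` deliberately, which is exactly the kind of friction that keeps the call graph legible.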

(I think more restricted forms of inheritance might work, particularly ones that make upcalls difficult to do -- but only if tooling firmly enforces discipline. As it stands though, class-based inheritance just has too many degrees of freedom to work well under sustained pressure. I think more restricted kinds of polymorphism work better.)


> This experience radicalized me against ...

My problem with OO bashing is not that it isn't deserved but seems in denial about pathological abstraction in other paradigms.

Functional programming quickly goes up its own bum with ever more subtle function composition, functor this, monoidal that, effect systems. I see inheritance-style layering reinvented in ad hoc, lazily evaluated doom pyramids.

Rich type systems spiral into astronautics. I can barely find the code in some de facto standard crates; instead it's deeply nested generics... generic traits that take generic traits, implemented by generic structs, called by generic functions. It's an alphabet soup of S, V, F, E. Is that Q about error handling, an execution model, or data types? Who knows! Only the intrepid soul who chases the tail of every magic letter can tell you.

I wish there were a panacea but I just see human horrors, whether in dynamically-typed monkey-patch chaos or the trendiest esoterica. Hell, I've seen a clean-room invention of OO in an ancient Fortran codebase by an elderly academic unaware it was a thing. He was very excited to talk about his phylogenetic tree, its species and shared genes.

The layering the author gives as "bad OO" (admin/user/guest/base) will exist in the other styles too, with their own pros and cons. At least the OO version separates each auth level and shows the relationship between them, which can be a blessed relief compared to whatever impenetrable soup someone will cook up in another style.


The difference, I think, is that much of that is not the easy path. Being able to make parent-child-parent-child calls is the thing that distinguishes inheritance from other kinds of polymorphism, and it leads to really bad code. No other kind of polymorphism has this upcall-downcall-upcall-downcall pattern baked into its structure.

The case I'm talking about is a perfect fit for inheritance. If not there, then where?


Encapsulation arguably isn’t a good part, either. It encourages complex state and as a result makes testing difficult. I feel like stateless or low-state has won out.


Encapsulation can be done even in Haskell, which avoids mutable state: use modules that don't export their internals, smart constructors, etc. You can e.g. encapsulate the logic for dealing with Redis in a module and never expose the underlying connection logic to the rest of the codebase.


Hmm, to me encapsulation means a scheme where the set of valid states is a subset of all representable states. It's kind of a weakening of "making invalid states unrepresentable", but is often more practical.

Not all strings are valid identifiers, for example, and it's hard to represent "the set of all valid identifiers" directly in the type system. So encapsulation is a good way to ensure that a particular identifier you're working with is valid, helping scale local reasoning (the code that validates identifiers) up into global correctness.
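Something like this, as a minimal Rust sketch (the module, the `Identifier` name and the validity rule are all made up): the private field plus the single constructor is the encapsulation.

    mod ident {
        pub struct Identifier(String); // field is private outside this module

        impl Identifier {
            // the only way to construct an Identifier from outside,
            // so the (made-up) validity rule holds for every value in the program
            pub fn new(s: &str) -> Option<Identifier> {
                let valid = s.chars().next().map_or(false, |c| c.is_ascii_alphabetic())
                    && s.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
                if valid { Some(Identifier(s.to_string())) } else { None }
            }

            pub fn as_str(&self) -> &str {
                &self.0
            }
        }
    }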

This is a pretty FP and/or Rust way to look at things, but I think it's the essence of what makes encapsulation valuable.


What you’re talking about is good design but has nothing to do with encapsulation. From Wikipedia:

> In software systems, encapsulation refers to the bundling of data with the mechanisms or methods that operate on the data. It may also refer to the limiting of direct access to some of that data, such as an object's components. Essentially, encapsulation prevents external code from being concerned with the internal workings of an object.

You could use encapsulation to enforce only valid states, but there are many ways to do that.


Well whatever that is, that's what I like :)


Not only network calls, but also parallelism, when that microservice does some processing on its own or is called from other microservices as well.

Add to it a database with all the different kinds of transaction semantics and you have a system that is way above the skillset of the average developer.


Out of curiosity I sometimes rewrite things as spaghetti (if functions are short and aren't called frequently) or using globals (if multiple functions have too many params). It usually doesn't look better, and when it does it usually doesn't stay that way for very long. In the very few remaining cases I'm quite happy with it. It does help me think about what is going on.


In theory, you could design a parallel set of software engineering best practices which emphasize long-term memory of the codebase over short-term ability to leaf through and understand it. I guess that would be "reasoning nonlocally" in a useful sense.

In practice I think the only time this would be seen as a potentially good thing by most devs is if it was happening in heavily optimized code.


An interesting point. Would there be any benefits to this non-local reasoning?


Not unless you own and run the business, I suspect. You probably buy yourself a much higher absolute threshold of complexity you can comfortably handle in the codebase, but it's not exactly like software developers are known to take kindly to being handed an Anki deck of design decisions, critical functions, etc. and being told "please run this deck for 3 weeks and then we'll get started".

I suspect it's much more common that codebases evolve towards requiring this nonlocal reasoning over time than being intentionally designed with it in mind.


> The main benefit of Liskov's substitution principle is ensure developers don't need to dig into each and every concrete implementation to be able to reason locally about the code.

Yeah, but it doesn't help in this context (enabling local reasoning) if the objects passed around have too much magic or are mutated all over the place. The enterprise OOP of the 2010s was a clusterfuck full of unexpected side effects.


I suspect that enterprise anything is going to be a hot mess, just because enterprises can't hire many of the best people. Probably the problem we should address as an industry is: how to produce software with mostly low wattage people.


The eventual solution will probably be to replace the low wattage people with high wattage machines.


Sure, once they can solve advent of code problems on the second week..


> I find types helps massively with this. A function with well-constrained inputs and outputs is easy to reason about. One does not have to look at other code to do it. However, programs that leverage types effectively are sometimes construed as having high cognitive load, when it in fact they have low load. For example a type like `Option<HashSet<UserId>>` carries a lot of information(has low load): we might not have a set of user ids, but if we do they are unique.

They sometimes help. But I think it's deeper than this. A function with inputs and outputs that are well-constrained with very abstract, complex types is still hard to reason about, unless you're used to those abstractions.

I think it's more accurate to say that something is "easy to reason about" if its level of abstraction "closely matches" the level of abstraction your brain is comfortable with / used to. This can vary dramatically between people, depending on their background, experience, culture, etc.

I could describe the Option<HashSet<UserId>> type in terms of functors and applicatives and monads, and though it would describe exactly the same set of valid values, it has a much higher cognitive load for most people.

> However, programs that leverage types effectively are sometimes construed as having high cognitive load, when it in fact they have low load.

Cognitive load is an individual experience. If someone "construes" something as having high cognitive load, then it does! (For them). We should be writing programs that minimize cognitive load for the set of programmers who we want to be able to interact w/ the code. That means the abstractions need to sufficiently match what they are comfortable with.

It's also fine to say "sorry, this code was not intended to have low cognitive load for you".


100% agree and this not only concerns readability. The concept of "locality" turns out to be a fairly universal concept, which applies to human processes just as much as technical ones. Side-effects are the root of all evil.

You don't see a waiter taking orders from one person at a table; rather, they go to the table and get orders from everybody sitting there.

And as for large methods, I find that they can be broken into smaller ones just fine as long as you keep them side-effect free. Give them a clear name and a clear return value, and now you have a good model for the underlying problem you are solving. Looking up the actual definition is just looking at implementation details.


There is an issue with reading code written by somebody else. If it's not in a common style, the cognitive load of parsing how it's done is an overhead.

The reason I used to hate Perl was around this, everyone had a unique way of using Perl and it had many ways to do the same thing.

The reason I dislike functional programming is much the same: you can skin the cat 5 ways, and all 5 engineers will pick a different way of writing it in TypeScript.

The reason I like Python more is that all experienced engineers eventually gravitate towards the idea of Pythonic code; I've had colleagues whose code looked identical to how I'd have written it.


Python 2, Python 3? Types or no types?


> Proponents of small functions argue that you don't have to read more than the signature and name of a function to understand what it does; it's obvious what a function called last that takes a list and returns an optional value does.

I used to be one of those proponents, and have done a 180.

The problems are:

1. The names are never as self-evident as you think, even if you take great care with them.

2. Simply having so many names is an impediment in itself.

The better way:

Only break things up when you need to. This means the "pieces" of the system correspond to the things you care about and are likely to change. You'll know where to look.

When you actually need an abstraction to share code between parts of the system, create it then.


Re: trust issues... I'd argue this is the purpose of automated tests. I think tests are too often left out of architectural discussions, as if they were some additional artifact that gets created separately from the running software. The core / foundational / heavily reused parts of the architecture should have the most tests, which ensures the consumers of those parts have no trust issues!


Tests are good but moving left by lifting invariants into the type system is better.

Compare

   fn send_email(addr: &str, subject: &str, body: &str) -> Result<()>
to

    fn send_email(addr: &EmailAddr, subject: &str, body: &str) -> Result<()>
In the second case, the edge cases of an empty or invalid email address don't need to be tested; they are statically impossible.
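For completeness, one way the `EmailAddr` side might look. This is only a sketch: the `parse` constructor name and the validation check are placeholders, and real email validation is far hairier than this.

    pub struct EmailAddr(String);

    impl EmailAddr {
        // the only way to obtain an EmailAddr, so send_email never sees garbage
        pub fn parse(s: &str) -> Result<EmailAddr, String> {
            let s = s.trim();
            if !s.is_empty() && s.contains('@') {
                Ok(EmailAddr(s.to_string()))
            } else {
                Err(format!("not a valid email address: {s:?}"))
            }
        }

        pub fn as_str(&self) -> &str {
            &self.0
        }
    }

Every call site then either handles the parse error up front or already holds a value known to be valid.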


Thanks for the small concrete example. I try to explain this a lot. It also makes coverage really easy to get with fewer tests.


I may be wrong, but my view of software is: you have functions, and you have the order in which functions are called. Any given function is straightforward enough, if you define its purpose clearly and keep it small enough - both of which can reasonably be done. Then we have the problem, which is the main problem, of the order in which functions are called. For this, I use a state machine. Write out the state machine, in full, in text, and then implement it directly: one function per state, one function per state transition.

The SM design doc is the documentation of the order of function calling, it is exhaustive and correct, and allows for straightforward changes in future (at least, as straightforward as possible - it is always a challenge to make changes).


Would love to understand this better. Is there any example you could point to?


    init -> success -> red
    init -> failure -> cleanup

    red -> success -> red_yellow
    red -> failure -> cleanup

    red_yellow -> success -> green
    red_yellow -> failure -> cleanup

    green -> success -> yellow
    green -> failure -> cleanup

    yellow -> success -> red
    yellow -> failure -> cleanup

    cleanup -> done -> finish
init/red/etc are states.

success/failure/etc are events.

Each state is a function. The function red() for example, waits for 20 seconds, then returns success (assuming nothing went wrong).

To start the state machine, initialize the state to "init" and enter a loop. In the loop you call the function for the current state (which makes that state actually happen and do whatever it does), and that function returns an event for whatever happened when it ran. You then call a second function, which updates the state based on the event that just occurred. Keep doing that until you hit the state "finish", then you're done.
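A small Rust sketch of that driver loop, using the traffic-light states above. The `run` and `next` names and the stub bodies are mine; a real `run` would wait, do the state's work, and report what actually happened.

    #[derive(PartialEq)]
    enum State { Init, Red, RedYellow, Green, Yellow, Cleanup, Finish }

    enum Event { Success, Failure, Done }

    // one function per state: do the state's work, return what happened
    fn run(state: &State) -> Event {
        match state {
            State::Cleanup => Event::Done,
            State::Yellow => Event::Failure, // pretend something went wrong so the sketch terminates
            _ => Event::Success,             // e.g. red() would wait 20 seconds, then succeed
        }
    }

    // the transition table from the comment above (failure arms collapsed into the wildcard)
    fn next(state: State, event: Event) -> State {
        match (state, event) {
            (State::Init, Event::Success) => State::Red,
            (State::Red, Event::Success) => State::RedYellow,
            (State::RedYellow, Event::Success) => State::Green,
            (State::Green, Event::Success) => State::Yellow,
            (State::Yellow, Event::Success) => State::Red,
            (State::Cleanup, Event::Done) => State::Finish,
            _ => State::Cleanup, // every failure path goes to cleanup
        }
    }

    fn main() {
        let mut state = State::Init;
        while state != State::Finish {
            let event = run(&state);
            state = next(state, event);
        }
    }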


Got it, thanks. But it seemed from your original post that you tend to write state machines a lot more than the usual engineer does, would that be correct? Would you use this in a crud rest API for example?


When writing code, the amount of structure depends on the amount of code.

More and more complex code requires more structure.

Structure takes time and effort, so we write the minimum amount of structure which is appropriate for the code (where code often grows over time, and then by that growth becomes unmanageable, and then we need more structure, which may require a rewrite to move from the existing structure to a new, fuller structure).

So with methods for organizing code, we go something like, in order of less to more structure,

. lines of code
. functions
. libraries
. OO libraries
. classes

A state machine is a form of structure, separate from how we organize code, and moderately high cost. I don't often use one, because most of the code I write doesn't need to be particularly rigorous - but for example I did write a mailing list, and that really is used, so it really did have to be correct, so I wrote out the state machine and implemented based on the state machine.

State machines also help with testing. You can keep track of which states you have tested and which events from each state you have tested.

I've never written a REST API in my life, so I can't tell you if I would use a state machine for that :-)


In regards to small functions, I think an important - but not often mentioned - aspect is shared assumptions. You can have many small functions with garbage abstractions that each implicitly rely on the behaviour of the others; in that case the cognitive load is high. Or you can have many small functions which are truly well-contained, in which case you may well not need to read the implementation. Far too much code falls into the former scenario, IMO.


I've seen functions called getValue() that were actually creating files on disk and writing stuff.

Also, even if the function actually does what's advertised, I've seen functions that go 4-5 levels deep where the outer functions do nothing but abstract optional parameters. So to avoid exposing 3 or 4 parameters, tens of functions are created instead.

I think you do have a point but ideas get abused a lot.


> Proponents of small functions argue that you don't have to read more than the signature and name of a function to understand what it does;

Although this is often the case, the style of the program can change things significantly. Here are a few, not so uncommon, examples where it starts to break down:

1. When you’re crafting algorithms, you might try to keep code blocks brief, but coming up with precise, descriptive names for each 50-line snippet can be hard. Especially if the average developer might not even fully understand the textbook chapter behind it.

2. At some point you have to build higher than "removeLastElementFromArray"-type functions. You are not going to get very far skimming domain-specific function names if you don't have any background in that area.

More examples exist, but these two illustrate the point.


Both examples stem from not understanding the problem well enough I think. My best work is done when I first write a throwaway spaghetti solution to the problem. Only through this endeavour do I understand the problem well enough to effectively decompose the solution.


You understand your final fine grained code after your 'spaghetti' intermezzo. Others and your future you, probably less so.


My point is that the factoring and abstractions one produces after the spaghetti intermezzo will be better than a blind stab at them; a greater understanding of the problem helps.


Agree that intermezzos - even of the spaghetti kind - help understanding.

I thought this thread was more about the (non-)maintainability of code consisting of many procedures, each of which needs a name that makes its usage self-explanatory.

From my experience, simple APIs with complex and often long implementations can work very well, as long as those implementations are low on side effects and normally DRY, as opposed to puristically DRY.


This is absolutely the right way to think about things.

I like thinking about local reasoning in terms of (borrowing from Ed Page) "units of controversy". For example, I like using newtypes for identifiers, because "what strings are permitted to be identifiers" is a unit of controversy.


Types are somewhat of a different dimension - sort of the classic one-dimensional argument about a two-dimensional problem domain. Which quadrant you're talking about alters whether the arguments support reality or argue with it.

If understanding a block of code requires knowing a concept that the team feels everyone should know anyway, then it’s not such an imposition. If the code invites you to learn that concept, so much the better. The code is “discoverable” - it invites you to learn more. If the concept is incidental to the problem and/or the team is objectively wrong in their opinion, then you have tribal knowledge that is encroaching on the problem at hand. And whether it’s discoverable or not is neither here nor there. Because understanding the code requires knowing lots of other things, which means either memorization, or juggling more concepts than comfortably fit in short term memory - cognitive overload.

You know you’ve blown past this point when you finally trace the source of a bad piece of data but cannot remember why you were looking for it in the first place.

I’m hoping the problem of cognitive load gets more attention in the near future. We are overdue. But aside from people YouTubing code reviews, I’m still unclear what sorts of actionable metrics or feedback will win out in this arena. Maybe expanding code complexity to encompass the complexity of acquiring the values used in the code, not just the local data flow.


The first step for allowing local reasoning is to break your product into independent subdomains that are as independent as possible.

For a software company, this means crafting the product ownership of your team such that the teams can act as independently as possible.

This is where most companies already fail.

Once this has been achieved, you can follow this pattern on smaller and smaller scales down to individual functions in your code.


`last` is something that is embarrassingly extractable, which makes it a bad example (you shouldn't write that function anyway in 99% of cases - surely someone has already written it in the stdlib of your language).

It's like taking "list.map(x -> x*x)" as a proof that parallelism is easy.

Most code is not embarrassingly extractable (or at least not at granularity of 3 lines long methods).


> Proponents of small functions argue that you don't have to read more than the signature and name of a function to understand what it does; it's obvious what a function called last that takes a list and returns an optional value does.

It's also interesting that in the comments on the same article many people argue against the PR process. I hardly see how else the level of discipline required to keep the names of small methods trustworthy can be maintained for any team with more than 3 developers.


I do not agree that typing leads to less cognitive load. Typing often leads to more and more complicated code. Dynamically typed code is often shorter and more compact. If dynamically typed code is well written, its function, inputs and outputs are clear and obvious. Clear and easy-to-understand code is not primarily a matter of typed or untyped code; it is a matter of a great programmer or a poor one.


There is a function. It takes in 4 parameters. One of them is called ID

Is ID a string, a number, a GUID? Better check the usage within the function.

Oh, the declaration is `id: number`

Mystery solved.

Even better if the language supports subtyping so it is something like id: userID and userID is a subtype of number.


In a dynamically duck-typed language it should not matter whether an ID is a string, a number or a GUID. The code should work with all of them. The semantically important thing is that this is an identifier. No string, number or GUID data type expresses this true meaning of the value.


It matters a lot even in a duck typed language.

If there are multiple types of user IDs, I don't want to pass the wrong one into a DB call.

This is often the case when dealing with systems that have internal IDs vs publicly exposed IDs. A good type system can correctly model which I have a hold of.

For complex objects proper typing is even more important. "What fields exist on this object? I better check the code and see what gets accessed!"

Even worse are functions where fields get added to (or removed from!) an object as the object gets processed.

Absolute nightmare. The concept of data being a black box is stupid, the entire point of data is that at some point I'll need to actually use it, which is a pain in the ass to do if no one ever defines what the hell fields are supposed to be laying around.
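A tiny sketch of that internal-vs-public point, with hypothetical names: distinct newtypes mean the compiler, not a reviewer, catches the mix-up.

    struct InternalUserId(u64);    // DB primary key
    struct PublicUserId(String);   // opaque token exposed in the API

    fn load_user(_id: InternalUserId) {
        // ... look the row up by primary key ...
    }

    fn handle_request(public_id: PublicUserId, internal_id: InternalUserId) {
        load_user(internal_id);
        // load_user(public_id); // does not compile: wrong kind of ID
        let _ = public_id;
    }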


By naming the variable ID it is crystal clear what the value is. Most of the time an explicit type only adds cognitive load for the reader and limits the universality of the code. At a high abstraction level, the type is usually, from a program-logic point of view, an irrelevant machine implementation detail. If a specific duck is required, it is explicitly tested for. This makes the code very clear about when the duck type is important and when it is not.


That's how you get a stray string in a column of integers.


Statically typed code definitely requires more effort to read, but this is not cognitive load. Cognitive load is about how much working memory is required. Statically typed code requires less cognitive load because some of the remembering is outsourced to the source code.

Statically typed code can lead to more complicated code; it can also accurately reflect the complexity inherent in the problem.


This is true at smaller scales and flips over on larger scales (larger codebase, dependencies, team/teams sizes).


A function is clear or it is not. I fail to see how the scale of the code, team, or dependencies is a factor in that.


I split local reasoning into horizontal or vertical.

Vertical reasoning is reasoning inside a module or function. Here information hiding and clear interfaces help.

Horizontal reasoning is reasoning across the codebase in a limited context; adding a new parameter to a public function is a good example. The compiler helps you find and fix all the use sites, and with good ability to reason vertically at each site, even a change like this is simple.


At work we have a pretty big Python monorepo. The way we scale it is by having many standalone CLI mini apps (about 80 atm), with most of them outputting JSON/Parquet to GCS or BigQuery tables. Inputs work the same way.

I insisted a lot on this Unix-ish philosophy (ish, because it's not pipes). It has paid off so far.

We can test each CLI app on its own, as well as write broader integration tests.


> If someone feels compelled to read every function either the functions are poor abstractions or the reader has trust issues, which may be warranted.

Or it's open source and the authors were very much into Use The Source, Luke!


The larger problem are things that have global effect: databases, caches, files, static memory, etc. Or protocols between different systems. These are hard to abstract away, usually because of shared state.


Weird, I read that between the lines of parent's post. Of course local reasoning precludes global effects.


I feel that one big way in which engineers talk past each other is in assuming that code quality is an inherent property of the code itself. The code is meaningless without human (and computer) interpretation. Therefore, the quality of code is a function of the relationship between that code and its social context.

Cognitive load is contextual. `Option<HashSet<UserId>>` is readable to someone knowledgeable in the language (`Option`, `HashSet`) and in the system (meaning of `UserId` -- the name suggests it's an integer or GUID newtype, but do we know that for sure? Perhaps it borrows conventions from a legacy system and so has more string-like semantics? Maybe users belong to groups, and the group ID is considered part of the user ID -- or perhaps to uniquely identify a user, you need both the group and user IDs together?).

What is the cognitive load of `Callable[[LogRecord, SystemDesc], int]`? Perhaps in context, `SystemDesc` is very obvious, or perhaps not. With surrounding documentation, maybe it is clear what the `int` is supposed to mean, or maybe it would be best served wrapped in a newtype. Maybe your function takes ten different `Callable`s and it would be better pulled out into a polymorphic type. But maybe your language makes that awkward or difficult. Or maybe your function is a library export, or even if it isn't, it's used in too many places to make refactoring worthwhile right now.

I also quite like newtypes for indicating pragmatics, but it is also a contextually-dependent trade-off. You may make calls to your module more obvious to read, but you also expand the module's surface area. That means more things for people writing client code to understand, and more points of failure in case of changes (coupling). In the end, it seems to me that it is less important whether you use a newtype or not, and more important to be consistent.

In fact, this very trade-off -- readability versus surface area -- is at the heart of the "small vs large functions" debate. More smaller functions, and you push your complexity out into the interfaces and relationships between functions. Fewer large functions, and the complexity is internalised inside the functions.

To me, function size is not the deciding factor [0]; what matters is whether your interfaces are real, _conceptually_ clean joints of your solution. We have to think at a system level. Interfaces hide complexity, but only if the system as a whole ends up easier to reason about and easier to change. You pay a cost for both interface (surface area) and implementation (volume). There should be a happy middle.

---

[0] Also because size is often a deceptively poor indicator of implementation complexity in the first place, especially when mathematical expressions are involved. Mathematical expressions are fantastic exactly because they syntactically condense complexity, but it means very little syntactic redundancy, and so they seem to be magnets for typos and oversights.


Types? "Option<HashSet<UserId>>" means almost nothing to me. A well defined domain model should indicate what that structure represents.


> A well defined domain model should indicate what that structure represents.

"Should", but does it? If a function returns Option<HashSet<UserId>> I know immediately that this function may or may not return the set, and if it does return the set, they are unique.

This is a fact of the program. I may not know "why" or "when" it does what it does, but as a caller I can guarantee that I handled every possible code path. I won't get surprised later on because, apparently, this thing can throw an exception, so my lock didn't get released.


it seems like you're just not familiar with the domains defined by those types, or at least the names used here


Are you? What would an option for a hash set of user ids even be? I just don't find that "types" magically solve problems of cognitive load.


> What would an option for a hash set of userids?

As an argument: an optional filter for a query, e.g. "return me posts from these users".

As a return value: The users who liked a post or nothing if it's not semantically valid for the post to be liked for some reason.

> I just don't find that "types" magically solve problems of cognitive load.

Cognitive load is about working memory and having to keep things in it. Without types one only has a name, say "userIds". The fact that it's possible for it to be null, and that it's supposed to contain unique values, has to be kept in working memory (an increase in cognitive load).


Even that means a lot more than `{}`, whose tortured journeys I have to painstakingly take notes on in the source code while I wonder what the heck happened to produce the stack trace...


Yes, but it means less than something like UserGroup. I hear you on {} though, I'm currently looking at "e: ".


Not everything is a functional program though and side effects are important. Types can’t* represent this.

*Not for practical programs


Absolutely with you on the idea in the abstract, but the problem you run into in practice is that enabling local reasoning (~O(1)-time reading) often comes at the cost of making global changes (say, ~O(n)-time writing in the worst case, where n is the call hierarchy size) to the codebase. Or to put it another way, the problem isn't so much attaining local readability but maintaining it -- it imposes a real cost on maintenance. The cost is often worth it, but not always.

Concrete toy examples help here, so let me just give a straight code example.

Say you have the following interface:

  /* opaque API (maybe third-party): takes a callback, but no way to pass data with it */
  void foo(void on_completed());

  void callback();

  void bar(int n)
  {
    foo(callback);  /* how does callback() get access to n? */
  }

Now let's say you want to pass n to your callback. (And before you object that you'd have the foresight to enable that right in the beginning because this is obvious -- that's missing the point, this is just a toy example to make the problem obvious. The whole point here is you found a deficiency in what data you're allowed to pass somewhere, and you're trying to fix it during maintenance. "Don't make mistakes" is not a strategy.)

So the question is: what do you do?

You have two options:

1. Modify foo()'s implementation (if you even can! if it's opaque third party code, you're already out of luck) to accept data (state/context) along with the callback, and plumb that context through everywhere in the call hierarchy.

2. Just embed n in a global or thread-local variable somewhere and retrieve it later, with appropriate locking, etc. if need be.

So... which one do you do?

Option #1 is a massive undertaking. Not only is it O(n) changes for a call hierarchy of size n, but foo() might have to do a lot of extra work now -- for example, if it previously used a lock-free queue to store the callback, it might now lose performance because it can no longer do everything atomically, etc.

Option #2 only results in 3 modifications, completely independently from the rest of the code: one in bar(), one for the global, and one in the callback.

Of course the benefit of #1 here is that option #1 allows local reasoning when reading the code later, whereas option #2 is spooky action at a distance: it's no longer obvious that callback() expects a global to be set. But the downside is that now you might need to spend several more hours or days or weeks to make it work -- depending on how much code you need to modify, which teams need to approve your changes, and how likely you are to hit obstacles.

So, congratulations, you just took a week to write something that could've taken half an hour. Was it worth it?

I mean, probably yes, if maintenance is a rare event for you. But what if you have to do it frequently? Is it actually worth it to your business to make (say) 20% of your work take 10-100x as long?

I mean, maybe still it is in a lot of cases. I'm not here to give answers, I absolutely agree local reasoning is important. I certainly am a zealot for local reasoning myself. But I've also come to realize that achieving niceness is quite a different beast from maintaining it, and I ~practically never see people try to give realistic quantified assessments of the costs when trying to give advice on how to maintain a codebase.


Initial implementation and maintenance need to keep design in mind, and there should be more clarity around responsibility and costs of particular designs and how flexible the client is with the design at a given point in time. It's an engineering process and requires coordination.


Add a global variable? Let's not go there, please. Anything would be better than that. In this case I would bite the bullet and change the signature, but rather than just adding the one additional parameter, I would add some kind of object that I could extend later without breaking the call signature, since if the issue came up once, it's more likely to come up again.


>I've been thinking about the notion of "reasoning locally" recently. Enabling local reasoning is the only way to scale software development past some number of lines or complexity. When reasoning locally, one only needs to understand a small subset, hundreds of lines, to safely make changes in programs comprising millions.

Have you never heard of the word of our lord and saviour oop, or functions? It's called encapsulation.

You might have learned it through programming languages, as it's an ideal embedded in them.


As another sibling comment pointed out there are many tools that enable local reasoning, encapsulation is one such tool.

I'm not claiming the idea is novel, just that I haven't encountered a name for it before.


I'm not saying that encapsulation is a tool for local reasoning, I'm saying they are the same concept.

How is the concept of local reasoning distinct from that of encapsulation?


I think most of us associate the word encapsulation with OOP nightmare code that spread mutable state across many small classes that often inherited from one another and hid the wrong state. Stateless and low state are the reaction to that. If you expand the term to include those aids to local reasoning then many more might agree with you.



