It Takes 6 Days to Change 1 Line of Code (2015) (edw519.posthaven.com)
568 points by Mz on Dec 6, 2016 | 287 comments



I want to fault this but I can't. Code reviewer who's not afraid to reject code? Check. Proper testing of code changes? Check.

The process probably looks silly in this case because nothing went wrong. But when a hard-coded variable is changed, who is to say it won't break some other system? And if the variable is then moved to a config file, who's to say it won't break in some obscure way, such as when the config file is not found in the uat/test environment? On top of that, once it's in the config file the manager can change the variable as they please without having to go through this process again.
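
For what it's worth, the usual mitigation for the "config file not found" worry is to read the parameter with a compiled-in default. A minimal sketch - the parameter name, file name and values are all made up:

  import json

  DEFAULT_THRESHOLD = 40  # hypothetical default, standing in for the old hard-coded value

  def load_param(name, default, path="parameters.json"):
      # Fall back to the compiled-in default if the Parameters file is missing
      # (e.g. in the uat/test environment) or the key isn't present.
      try:
          with open(path) as f:
              return json.load(f).get(name, default)
      except (OSError, ValueError):
          return default

  threshold = load_param("threshold", DEFAULT_THRESHOLD)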


The process is fine, but sometimes shit is broken and you need to fix it now and clean up the existing mess later.

Why not instead of

> Shirley (Code Review): It is now against company policy to have any hard-coded variables. You will have to make this a record in the Parameters file. Also, there are 2 old Debug commands, an unassigned variable warning message, and a hard-coded Employee ID that will all have to be fixed before this module can be moved to production.

Add at the end:

However, I'll allow the change for now because this is an important update that affects people's jobs. I want to see every issue fixed before the end of the week, and I will not approve any other changes until these issues are resolved.

There, win-win.


> The process is fine, but sometimes shit is broken and you need to fix it now and clean up the existing mess later.

Working in big enterprise shops, I've noticed that periodically that attitude does take hold, and the result is that the firefighting ends up producing more fires than it puts out.

If you could change most of the people, culture, and incentives in a non-software-centered enterprise all at once (and this typically goes far beyond the IT shop), sure, maybe more fluidity in software development and release process would work better. But that's usually not possible, nor is it usually possible to durably wall off a group within an enterprise to isolate it from the problems of the larger organization (though that does happen sometimes.)


Exactly this. I think dev shops are fairly stratified into quality levels. "fix it now and clean it up later" works in a certain class of shop, and that's great.

But there's a huge swath of dev shops that never pay any technical debt back, and the compound interest on that technical debt consumes all productivity. Past a certain point, drawing a line in the sand and forcing better practices no matter how easy it is to rationalize 'one more hack' is the only way to break that cycle.


> But there's a huge swath of dev shops that never pay any technical debt back

Engineers often talk about technical debt as if the codebase is the currency of the realm. It's not. Business Value is the currency of the realm. The code is something that facilitates business value, when (and only when) it does what the business needs. The people working on the code have a customer, the business. The business is locked in to a single provider for that service. And the service provider has grown complacent, prioritizing its own needs and processes ahead of the needs of the customer. They're behaving like other entrenched and untouchable service providers do, in monopoly industries like government and cable companies. Deal with my service levels, or suck it.

There are many customer-obsessed engineers who know how to strike this balance, but this is one of the many issues where there's selection bias in who will take the time to participate in the discussion. People who think this is a powerful story of IT done right need to broaden their minds, stop playing around on Hacker News, and start thinking harder about how software service providers can do a better job of learning how to say "yes" faster.


> Engineers often talk about technical debt as if the codebase is the currency of the realm. It's not. Business Value is the currency of the realm.

"Technical Debt" is a term that exists largely to communicate the idea that the thing it represents increases the dollar cost of both operating the software to deliver constant business value in a steady state, and of delivering changes to the software to continue to deliver the same value as business needs change or to realize additional opportunities to deliver new value.

It is, absolutely and centrally, about business value; ignoring it in decision-making is exactly like ignoring maintenance/operations costs in a capital purchase and considering only acquisition costs.


Not all tech debt needs to be paid back in full - thinking otherwise is indeed a fallacy, as you point out.

The catch, though, is that the rate of business value production is inversely proportional to the amount of tech debt, meaning a ton of yeses today will mean a grinding halt in a couple of years. I have worked on too many (very successful) teams now, and watched many others, that have gotten themselves into this situation.

After that point you end up in a place where tech debt repayment efforts can take years - which also tends to coincide with the 4-5 year mark when the original developers are tired and leave. The end result: a giant legacy system operated by a ton of new folks who have no clue what's up, which eventually gets 'deprecated' and replaced by a brand new one.

Ideally, I do think data is gold, but code needs to be incrementally ripped out and replaced every 3-5 years or so. Otherwise entropy cripples the system.


> start thinking a little broader about how software service providers can do a better job of learning how to say "yes" faster.

We don't need more "yes men", we need more people able and willing to say 'Here are the trade-offs', and work with the business from there.

Yes, it absolutely is about Business Value, and the best Business Value is not always "yes" (of course, sometimes it is).

... speaking as someone currently dealing with and seeing the Business Value impact of technical debt.


But for a lot of things, this is actually generally acceptable. Back office business apps, for example, where because of internal politics and attrition, they generally end up being replaced wholesale every 5-10 years, which is approximately how long it takes technical debt-ridden code to become so brittle it's unsupportable. A CIO can save millions by allowing this behavior. Thousands upon thousands of apathetic IT staff (including programmers) make decent livings in enterprises like this.


5-10 years? In a company of what size, in what industry? For most "back-office" apps you're talking 2 years for the rewrite just to reach (almost) feature parity, and another year for the users to be trained and feel comfortable with it. If you were ripping and replacing a non-technical user's go-to app every 5 years you'd have a revolt.


We have 2 apps that are used for the same thing: the original one and the rewritten one. The rewritten one was coded up in C# over 3 years and isn't anywhere near as robust as the original. The original was written in VB3 circa 1995 and dragged to VB6, where it's stayed to this day.

There's so much technical debt on the old app that the first change I made to it 5 years ago caused it to stop compiling: the single line I added to a function made it too big to compile.

If you work in a shop that has been around long enough to lose its training wheels and get a driving licence, then you have technical debt that's core functionality that everyone is scared of breaking.


Anecdotal example: The place I was working at had high software quality standards, but eventually created a separate team that wrote in a bizarre XML 'language' of sorts. This was, in effect, still part of the software, and the sins committed there were horrendous, but all was allowed in the name of "business value".

The end result of that being that you couldn't really change _anything_ without it breaking any of N client installs (each custom) - and you couldn't reason about which one touched a particular part of the code, or what assumptions it made.

That company was all but dead inside two years. There were other factors leading to its demise, but far and away the biggest issue was the crippling technical debt that was piled on with reckless abandon.

There are definitely trade-offs to be made, and sometimes it makes sense to take on some technical debt. Your example of back-office apps may be one of them, but there's certainly a subset of software development for which this is both a business and technical-level suicide.


At least in this example, the entirety of the project involves fixing that "technical debt" of hard coding a variable, and it's being held up for not fixing _enough_ technical debt.


> and the result is that the firefighting ends up producing more fires than it puts out.

I usually find that customers

- Prefer getting features early, even if unpolished, meaning ship something broken and fix it later rather than polish and ship something perfect.

- Prefer many, complex features over simplicity/performance in the product.

- Prefer having more bugs, but quick turnaround over having few bugs and a long turnaround.

In all the enterprise scenarios I have worked in, I have tried to stress the importance of simplicity, etc., but what customers have really stressed is that they don't mind bugs, crashes or poor performance as long as they can get their work done - and that might depend on that last feature or putting out that last fire.


They say that now. But what happens when they list all the bugs against you when it comes to paying the invoice?


They say, thanks for fixing all those bugs. They know the bugs exist because they encountered them. Fixing bugs actually looks like work is getting done.

Unlike if you came back to them with a polished product with no bugs, they would be like why am I spending all this money for this time?


That’s related to the concept expressed by the behavioural economist Dan Ariely: “consumers don’t value goods and services solely by their utility, … but also a sense of fairness relating to how much effort was exerted” [1]. In that quote, I’d probably qualify the word effort as “perceived effort”.

http://danariely.com/2010/12/15/locksmiths/


The customer doesn't know about or understand technical debt.

Any improvements you make down there just rack up time spent, which makes the bills higher - and that they will notice. They do not see your improvements, but they will notice if your much betterer code now introduces a new bug.

Higher bills than your predecessor, slower fixes, and new bugs: that is when you should worry about getting your bills paid, not when you fix things fast.


The thing is, the existing customers are locked in (or we failed).

The customers we want are those that aren't customers yet. Those customers require features that we don't have yet in order for them to become customers.

It might be the right decision as a business to screw the existing customers by accumulating tech debt in order to grab new market share before the competition does (again, this is in a context where a customer has no chance of leaving once on board - which is more or less the case in all enterprise scenarios I have experience with, both web and desktop).

I hate this mentality and as a developer it's terrible - but I can't say that it's the wrong decision to make short term, business wise.


Guess I simply do not share your view. My customers are not locked in; in fact, I will go out of my way to make it easy for them, their employees or even another consultant to continue any of the work I do and have done for them in the past.

In addition, I am not in favor of accumulating technical debt - I suppose I wasn't clear. The article talked about a 1-line fix, and that by itself should not trigger a rewrite of a feature, especially not when the fix has to be done now, not next week.


> But what happens when they list all the bugs against you when it comes to paying the invoice.

I have never seen that happen, but then I have never worked on invoiced/project work, only long term enterprise/product dev. Customers license the software, but I suppose there could be some SLA type agreement also for desktop software (which happens to be the case).


OT: I was about to post that women (Shirley) are bossy, but then I pretended that the named individual is Sam and the entire tone of the comment shifted immensely. It was as if the big boss was making this comment. And big bosses are never looking into things of this much detail, which gave more importance to the task.

There is something that isn't right here. I am not sure why I am feeling this.


Welcome to the scary world of recognizing your own biases!

It's a good thing.


It's not Shirley's job to make an executive decision, like whether to break policy or not.

If the company policy should be ignored, then the issue should have been escalated to someone with the authority to make that decision.

IMO the real problem is that the submitter didn't realize that was the case and so didn't escalate sooner. This could have all been avoided by asking for (and getting) carte blanche from Philip or David in the first place.


And yet, if everyone is aware not only of company policies but also of the underlying principles, everyone also has the authority to deny something. Hardcoded values, for example, would violate a principle about maintainability.

Talk about policy and it's politics and escalation and such. Talk about principles and it becomes a moral issue.

Alternatively, think of it as breaking a specific law (such as the speed limit) vs being immoral in a more general sense of the word.


I think making this a moral issue slightly misses the point. Even if Shirley understands the underlying principle of maintainability and is motivated to do the best thing for the company, she is still not the right decision-maker because the relevant risks and rewards are not fully visible to her.

Decisions should be made at the lowest level where all the relevant information is available. In this case, the executive understands the pressing reasons for making the change, so is properly able to weigh the incremental risk (with the technical advice of Shirley et al) vs. the business priority, whereas Shirley can only see half of the picture.

Imagine if Shirley had gone ahead and broken policy, then something went badly wrong. She likely would've been blamed for the whole fiasco.


> Imagine if Shirley had gone ahead and broken policy, then something went badly wrong. She likely would've been blamed for the whole fiasco.

This is what I was looking for in this entire thread. It's one thing to push through a small change in a small internal program for something that isn't critical like a batch job that clears old data from a server. That's not what this is.

It changes the potential output of the entire company. Yes, it's a single LOC totaling out to a single byte. Yes, it passed all of the tests. However, tests are written by people, and tests are only as good as the people who write them. This is partially why there are human eyes in the QA/IT/UAT phases. Shirley would have been viewed as the single point of failure had something gone wrong, because she made the decision to break policy.


Yeah, if I was Shirley, I'd just have suggested "why don't you ask David to overrule this one?", but definitely not overruled it myself.


If I was Ed, I would have grabbed David and told him to tell Shirley to allow the change as is. Probably before submitting it for review.


Because everything is urgent, all of the time... and so nothing gets improved.


This is what escalation is for - the job of a VP or Director of Engineering is to decide what is actually urgent and approve fast-tracking it, and what should actually follow process. Having an efficient escalation process is critical.


Very much this: there should be a process for "emergency change" and that process has to be used for every actual emergency and not for anything else.


I've found that those in charge also have a budget to run you on... how do they justify you spending x hours updating code to do exactly 0 as far as anyone is concerned? ..plus all the regression testing? They just don't... it gets left as it is


The word "should"


"I know this is priority 1, but is it priority 1A, 1B, or 1C?"


Every numeric priority scoring system at every company I've ever worked with has devolved into this: Start with P1 (high), P2 (medium), and P3 (low). Sooner or later, an exec will swoop in and declare something Even Higher Priority Than P1!!! So "P0" gets invented, and after some time you have tons of P0 issues, meaning P1 is the new "medium", P2 is the new "low" and P3 becomes "never". Then something comes in that is Even Higher Priority than P0!!! And now, you're stuck.

For this reason, I always advocate reversing the numbers. The higher the P number, the higher the priority. So you never run out of numbers to express how much more urgent and earth shattering your next bug is. Try it!
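
A toy sketch of the inverted scheme, using an ordinary min-heap (the issue names are made up; negate because heapq pops the smallest value first):

  import heapq

  # Higher number = higher priority, so an "even more urgent" level is just a
  # bigger integer - nothing existing has to be renamed.
  queue = []
  for prio, issue in [(1, "cosmetic typo"), (3, "prod outage"), (2, "customer escalation")]:
      heapq.heappush(queue, (-prio, issue))

  while queue:
      neg_prio, issue = heapq.heappop(queue)
      print(-neg_prio, issue)  # 3 prod outage, then 2 customer escalation, then 1 cosmetic typo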


I worked at a big company that had a big data warehouse that you could submit jobs to: these jobs could be given one of five priorities from "lowest" to "highest". Naturally, everyone set their jobs to "highest", so jobs that generated some weekly email report that nobody really read were competing at the same level as jobs that were actually business critical (e.g., necessary for ad auctions to work correctly or whatever).

So they introduced a new priority level: "Critical". The pattern repeated itself in short order.

It was a tragedy of the commons thing, and eventually it was literally impossible for a job to ever finish if it was marked anything other than "Critical".

So they introduced "Omega Critical". Now you needed to be on a permissions list maintained by the database team to raise something to "Omega Critical". I got on the list (because I was scheduling actually high-priority stuff). Then everyone in my org would come to me and ask me to set random bullshit to Omega Critical, because they knew I had permissions and "one more job won't hurt"...

I don't work there anymore, but I believe they have since developed even more weird categories like "Q4 Omega Critical" and who knows what else.


I've seen 2 pretty good solutions to this.

1. VP bangs heads with the equivalent VP in the other department, complaining there are too many requests. Can work; depends on the individual.

2. Actually have a service level agreement. Requests are limited to X per hour/day/week/whatever the process needs, and billed accordingly. Have a bit of buffer in this, a little creative billing. Requests within SLA are dealt with in agreed turnaround time. Have a dashboard that clearly shows this to customers. Alert customers if they're consistently submitting what's outside of SLA, and they will receive a degradation of service unless they provide further funding.

Everyone knows if you pour too much water into a cup it will overflow.


Having a budget for each department and a cost for running the queries is a great idea... I'll try to implement it the next time I'm in this situation. (I have also had a ton of VPs - plus the CEO and the owner - come to me with "very urgent now now now!" requests... it would have been great to tell them "the top priority for me is this query worth $1000, can you beat that?")


Sure. That's how external service providers work - being internal is what the buffer is for: you can 'do a favour' when needed. If it is the owner making an ASAP!OMG! request, perhaps take the chance for a chat, to demonstrate you have a system and that you're not only managing but taking a stance of leadership (and point out that, as it is internally charged, they're not losing any cash).


Data warehouse queries at Amazon worked like this (5 levels of priority). Everyone set the priority for all of their queries to the highest possible level, every time.


This is what I like about kanban: there is a queue, and priority is not a status.


Remove all priority levels.

Put a checkbox: "This is an emergency". People are suddenly less likely to click it.

And anyway, if it really were, they'd have called so it still doesn't matter but it makes them happy :D


I worked at a place with such a button. Neat caveat: only VPs could uncheck it...

(And I managed to check it within my first month there. My boss kindly let me know the difference between a client saying emergency and a VP saying so....)


I've seen where through general abuse and repeated, early dismissal of bug reports, people are extremely reluctant to call anything an emergency ... even actual emergencies.


I'll never forget the company I consulted at where the priorities were harder and harder to say with a straight face.

To get to the front of the queue, you had to mark your request "hot and urgent for the whole team"


I just got a sudden bitter taste in my mouth when reading "P0" and remembering the people who used to say it out loud; is that normal?

Another great thing about inverting the priority numbers is that nothing can easily go beneath a floor of 0, though ultimately all of this is about fixing the symptom and not the cause.


The numbering system possibly comes from ITIL, where priority is determined as the product of two factors: impact (scope) and urgency. Those definitions seem to be pretty consistent in most IT orgs I've seen, regardless of how poorly executed or immature the org is, and they often include gating factors like the ones other comments in this subtree have been discussing. If some of the most immature and byzantine IT processes can get this nonsense settled, engineers can avoid the bikeshedding and goalpost-moving too, if they actually put some effort into a sensible system and enforced it.
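
Roughly, the impact x urgency mapping looks like this - the bucket values below are illustrative, not quoted from ITIL:

  # ITIL-style priority: 1 = highest on every axis.
  PRIORITY = {
      # (impact, urgency): priority
      (1, 1): 1, (1, 2): 2, (1, 3): 3,
      (2, 1): 2, (2, 2): 3, (2, 3): 4,
      (3, 1): 3, (3, 2): 4, (3, 3): 5,
  }

  def priority(impact, urgency):
      # impact = how many people are affected, urgency = how soon it must be fixed
      return PRIORITY[(impact, urgency)]

  assert priority(1, 1) == 1  # org-wide and urgent: top priority
  assert priority(3, 3) == 5  # one user, can wait: lowest priority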


Order them, don't bucket them. That is, the highest priority is head of the queue, and the lowest is at the bottom.


I have a different issue: everything kind of always gets promoted to a must-have. So now I've made 2 priorities, need-to-have and nice-to-have. For any given collection of work that needs to be done, there can be no more than 60% need-to-have estimated work hours. When the deadline is accepted by my team, it means that all need-to-haves are done at that deadline, and the nice-to-haves probably are, but no guarantees.

Every single fucking time the PO thinks the deadline is too late, she tries to remove nice-to-have issues - sorry, but that doesn't change the deadline.


A former co-worker of mine told me about their priority tracking software, where you could enter a number from 0-5 for each bug, where 0 was the highest priority and only certain senior managers could set a priority of 0.

However, people soon discovered that priorities were stored as a float in the database and they could enter any number 0<n<=5 in the field. So bugs quickly started to get priorities like 0.5, 0.25, 0.1, 0.05 and so on...
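
The fix, of course, is to validate on the server rather than trusting the form. A minimal sketch - the bug object and the permission check are made up:

  ALLOWED = {0, 1, 2, 3, 4, 5}

  def set_priority(bug, value, user):
      # Reject 0.5, 0.25, ...: priorities must be one of the defined integers.
      if value not in ALLOWED:
          raise ValueError("priority must be an integer from 0 to 5")
      # Keep the "only senior managers may set 0" rule in the same place.
      if value == 0 and not user.is_senior_manager:
          raise PermissionError("priority 0 is reserved for senior managers")
      bug.priority = value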


This is inevitably what happens when the people reporting the issues get to decide their priority (I've seen this even in a two-person dynamic). You have to have discipline, and you usually have to have an impartial outsider (to the issue at hand) who doesn't care about other people's panic to actually set priority levels.


I've seen systems that have two fields, e.g. priority and severity. Where severity is "how important the client thinks it is" and priority is "how important we think it is".


Our system has another one: who it impacts. Does it impact a user, a department, or the entire organization? The difference between "my crap's broken," "our crap's broken," and "everybody's crap's broken".
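
Something like this, presumably - the field names are assumptions:

  from dataclasses import dataclass
  from enum import Enum

  class Impact(Enum):
      USER = 1          # "my crap's broken"
      DEPARTMENT = 2    # "our crap's broken"
      ORGANIZATION = 3  # "everybody's crap's broken"

  @dataclass
  class Ticket:
      title: str
      severity: int     # how important the client thinks it is
      priority: int     # how important we think it is
      impact: Impact    # who it actually affects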


Heh. I like that distinction.


No, sooner or later an exec swoops in with "this is our largest customer!!!!". It's still a P1, it just gets bumped to the front of the line - because revenue. Nothing wrong with that, and no need for a different numbering system. Their P1 is no different than a smaller customer's P1. They just get to be first in line because they pay to play.


Or what is even more likely, in my experience, "this could be our largest customer!!!!"

Sales-driven-development is lots and lots of fun.


I once had a priority list where a few items had a "low" priority, the rest being "high", "very high" and "extremely high". It gave me a depressing feeling because we had to delay stuff that was supposedly "high" priority for someone.


The problem with priorities is that multiple issues can have the same priority. If you give every issue its own priority number, then you no longer have this problem.


Hire P4 interns or contractors :)

If possible.


And let me guess... everything priority 2 or lower means 'never' :-)


  UPDATE jira SET priority=NULL WHERE 1=1;
Here's your bugfix.


Or priority zero.


The best solution against "priority inflation" that I ever saw: support request form of the DreamHost web hosting provider had priority levels with vivid descriptions (examples) of their meaning, to the extent that the top priority level was called "people are dying". I doubt anyone dared to select that priority unless people were indeed dying.


Nah. Nothing is ever urgent.

Just take all tickets that are "urgent", remove the urgent and put them at the back of the queue.


Story of my last few jobs.


This is quite literally how technical debt is created. The story said that there was already a set of bugs being worked on. Which means that by adding this, the technical debt backlog would only continue to grow.

I worked at a company where the technical debt was so deep that I really wanted to say it was technically bankrupt, meaning the software should be entirely rewritten. That goes against every grain of my experience, and we ended up not declaring technical bankruptcy, but if you want to understand where it starts, it starts with "we'll do it later."

Each of those decisions needs to be accounted for just like a credit card purchase.


I think that's the key - let's not forget what's at stake in this story: laying off people. What's the price of extending the tech debt of this core code - no 'proper' unit tests, hardcoded values, and employee IDs in code? Is that greater than the cost of laying people off?

It is easy to forget what the real world looks like when staring at a screen all day. We take out tech debt like a mortgage - sometimes that debt doesn't make sense and we end up having to foreclose on it, other times we're able to pay it off slowly over time and it all works out.

Where the analogy to debt falls down is that sometimes you luck out and don't have to repay tech debt. Maybe your legacy manufacturing system gets replaced with something modern and all of that horrible code magically vanishes - maybe your single threaded software from the 90s is finally outdated and you need something modern and would have had to rewrite it anyhow.


>Where the analogy to debt falls down is that sometimes you luck out and don't have to repay tech debt

Just call that "winning the tech debt lottery". Much like winning the real lottery it definitely happens but not enough to rely on it.


It seems like the code review policy should accommodate for scary legacy code.

What happens if the findings are fixed later in the week? Are they promoted, too? Are large swaths of the system regressed because of the changes? The process should allow the reviewer to say "oh, this is legacy code--the actual change looks good, and I won't worry about the code from ten years ago that doesn't conform to the standards we created a year ago."


Fixing code to meet more recent standards has a nontrivial chance of introducing a bug. If it is to be done at all, it should be done on its own account, and not piggybacked onto something with its own goals.

No matter how hard the process mavens try to reduce development to painting-by-numbers, good human judgement is the key to success.


Agreed, as that is the big issue. Anyone who requires that a patch to production include more than the absolute minimum code change has a broken process. You bring that code up to the new standards in the next release, when you can do a much fuller test.


Yes, totally agree. Far more risk is introduced by the cleanup than by the original change. The cleanup is still important, but shouldn't be fast-tracked.


I get the intention here, but sometimes code really needs to be touched. Otherwise, in the modern day, you end up with some-variant-of-Whitesmiths code that nobody can read, or ever wants to read. If you separate it out, business processes usually impede your ability to ever get to the spun-off task, and then it just gets forgotten.


With that attitude all code is legacy code the day it is created, because it will never be updated to conform to future standards.


That's a reasonable point, but how do you (as a product owner) justify the cost of making those fixes (and doing the regression testing necessary to ensure the system still works properly)?

My group supports several legacy applications (10+ years old). The product owners can barely be bothered to green-light bug fixes. I can't imagine they could find the justification to green-light tech debt fixes.


I think the language of technical debt can actually be very meaningful to business. It's fair to take on debt in the short term to make initial gains, but you need to have a reasonable perspective about the ongoing cost that debt has, and pay it off on an ongoing basis if you can't afford the debt service.

If it's a decade-old application that rarely even gets bug fixes, it's probably not worth doing major refactorings. On the other hand, if it's your core product, and your team is regularly making updates and fixes, it would be irresponsible to not incorporate your team's latest standards.

Also, it's worth investing in test automation, because the balance really shifts if you can minimize the cost of regression testing by automating it.


That's the worst thing you can do. The change will be forgotten and will never be properly worked out. Why should it? It just works.

I have seen a 270k loc codebase where every change has been made by this kind of thinking. Nobody ever went back to make the "temporary" changes properly.


It depends on the project. If this is critical code, 9/10 times it might not be an issue to let it slide in favor of getting the release out early, but that 10th time is a disaster. Weird things start happening and you don't know why and the team dealing with the fallout may not even realize which change caused the issue.

So now there's a lot more disruption and a ton of frustrated people throwing way too much energy into finding what ultimately ends up being a one line mistake.

After a few rounds of that you decide that if it's going to take 6 days, it's going to take 6 days.


Open a bug for the other changes and change only what's needed (unless they're literally in the same line and have no other secondary impacts)


Right - the idea that a QA person would happily jeopardize people's livelihoods over a box-ticking exercise is why that profession gets a bad reputation...


> However, I'll allow the change for now because this is an important update that affects peoples jobs. I want to see every issue fixed before the end of the week and I will not approve any other changes until these issues are resolved.

This goes in the backlog and gets pushed down further and further and further until it's backlog clutter that nobody even notices, until that particular chunk of code no longer exists.


>The process is fine, but sometimes shit is broken and you need to fix it now and clean up the existing mess later.

When did you ever get to this land of milk and honey called "later"? Whenever you have a JFDI because it's broken, the way it's coded always stays...

I've never seen a manager sign off on reworking code that does the job, just to make it "more manageable" or "standard" or "less of a bodge"


> The process is fine, but sometimes shit is broken and you need to fix it now and clean up the existing mess later.

A problem is that most minor changes in the big picture are thought of as being critical things that need to bypass the process in the short term.

It's not just this case that seems to make sense. It's that there's dozens or hundreds of other similar critical patches that need to bypass the process.


What is to keep Shirley from simply filing up to four new tickets? Isn't that better?


Shirley keeps finding problems everywhere, she's not a team player.


Can you explain how finding problems is the opposite of being a team player? How is a team supposed to get things done if there isn't someone to recognize problems?


No I can't. It's just an attitude that I often see from management.


Management is almost never incented to understand defects.


Not only is Shirley not a team player, she's also a hypocrite. I guarantee there is non-standards-based code touching the refactors she demanded, at some level.


Sigh


This smells. I've seen plenty of the first part (fix it quick), but rarely any of the latter (fix it properly later). And the former stacks up rapidly as weeks turn to months turn to years.


>The process is fine, but sometimes shit is broken and you need to fix it now and clean up the existing mess later.

Because "later" might end up actually being in the middle of the night, or later might not ever come. And then the mess might be magnitudes bigger than what you were facing before. Just wait, now. You have the time. Always. Unless your platform/software is continually crashing, your business always has time. Changing code like that without review is begging for you to "not have time" and cause an undue amount of stress for everyone involved.

Be an engineer, not a monkey. We're all only human and mistakes will be made.


I find this attitude on code reviews to be extremely demotivating, especially on code bases where I am not the primary committer or lack overall responsibility for the system being changed. It completely distracts from the code under review, and it only requires a small amount of social/organisational ill health to be abused.

Even if you believe that the next committer should clean up, it can require quite a big context switch - either the clean up explodes into quite a big task and you have to question the trade-offs being made in doing it now, or it doesn't have full focus and becomes risky and/or no real improvement.

I don't happen to believe that the next committer should clean up in most of these situations. Minor things, sure, you should do it (but don't side-track the review with it!). In my experience, people who already couldn't be bothered to do it or clean up later use a new commit as an excuse to get their work done for them. I've often seen people criticising tangential code issues in review who were responsible for the original commit. The question becomes - if it's such a minor issue for me to clean up, why wasn't it a minor issue for you to clean up? If it's bigger, how did it pass review in the first place?

My view is that extraneous cleanup should be proportional to the original change at most - and not discussed in code review of unrelated changes. If a particular area of the code sees sufficient churn to accumulate more and more debt, everybody will know and talk about it, it will be clear that some specific refactoring is necessary, and more importantly it will be evidence that small cleanup is worthless (high churn) and a better architecture is valuable (high churn). A given team may have high tolerance for technical debt and thus never do this work. But then they are screwed anyway, why create noise at every review just hoping that someone else solves it?

Remember, there's also a good chance that the mess stays the same as it always was, sitting there passively. And there's a chance that the cleanup introduces unanticipated problems, especially if it's legacy code with no/less testing.


Are you addressing my statement or someone else's? I don't understand how going through with code reviews for any change is somehow distracting from "code under review."

And surely addressing the code you're touching isn't "extraneous" cleanup. You're modifying the code, you can do it bit by bit.


I think he's addressing this line from the original article, which sparked this whole sub-thread way up-thread:

"Also, there are 2 old Debug commands, an unassigned variable warning message, and a hard-coded Employee ID that will all have to be fixed before this module can be moved to production."

Basically, it's the idea that if you touch a piece of code, you have to fix everything else wrong with it, even if that involves touching other areas of the code that you weren't looking at initially and may not even have familiarity with.

This is "the perfect is the enemy of the good", applied to code refactoring. If your policies put people on the hook for fixing mistakes that were made by other people a long time ago every time they touch code, there is a very strong incentive to never touch code. You get a bunch of hyperactive software wizards that shit out a working product (with a bunch of crap code), put it on their resume, and then move on to another employer. The people hired after them have a strong incentive to never do anything (because touching code will require that they touch a bunch of other stuff that they don't understand and have no interest in dealing with), and so they browse Reddit and Hacker News all day long. The code remains crappy, but its crappiness can't be blamed on anyone except folks who are no longer with the company.

Actually, this is a pretty accurate description of the software industry.


Cause when you say, "You can fix it later," later never comes.


"The CTO mandated that we reformat the TPS reports by Friday. We'll add the change next week."


Yeah, this is the kind of "everybody's empowered to push back and insist on doing things correctly" environment that people normally praise. Or at least, they praise until they realize that people will push back and insist on quality, and then they decide it's stifling and prevents them from getting anything done.


The problem is when the push back is not standardized. I've seen bad code go through with comments like "we can deal with these things in a later PR or two" but then other code gets flagged for "it fixes one problem but it could fix three other things too" which increases scope. An inconsistent standard is better than no standard but it's just as frustrating.


They're not pushing for quality. They are plainly pushing against it and deliberately risking introducing more bugs.


It's all a balance though, right? And really, about the flexibility of the environment. You have to be able to recognize and deal with a broad spectrum of priorities - this, at least, was not prioritized highly enough across the environment and that meant it wasn't handled gracefully.


My reaction exactly.

And 2 days were lost to escalation while 2 were spent in test. Everything else didn't look crazy to me if the change is small but has a high system impact.

For an industry that is implicated in downed spacecraft, massive security vulnerabilities, and a reputation for unstable products, it's amazing how happy we are to optimize for developer satisfaction over all other metrics (I'm looking at you, Agile).

Edit:

And before anyone says it, I'm not claiming this is a good process. But coding velocity is just one metric by which we should be measuring our success.


>...it's amazing how happy we are to optimize for developer satisfaction over all other metrics (I'm looking at you, Agile).

Developer comfort is not a goal of agile; the goal of agile is always having the project in a known good state.

Any developer comfort is a side effect of the prerequisites for having the project always be in a known good state.


I don't really disagree with you, but it is not unreasonable to come away with the point of view that the parent had. The agile manifesto has caused as much harm as it has good, I think. People use the manifesto as a methodology, rather than using it as a way to reflect on a methodology. In the extreme, some people will use the manifesto as an excuse to avoid not only a methodology, but a method as well! "As long as my intentions are pure, I can do no wrong" kind of thinking.

Having said that, I think to say that "the goal of agile is always having the project in a known good state" is simplifying things a bit too much. A very important part of agile is preferring artefacts that are executable over artefacts that are not.

More relevant to this discussion is the idea that people should be making judgements rather than relying on hard coded rules of process. If someone has made the decision that the risks of doing something is not worth the benefit, then avoiding the problem is not a waste of time. Importantly, though, a process or rule can not make judgements -- only people can. Following a rule blindly certainly can waste a lot of time.

Setting up a team is not easy. If you put yourself in a situation where every single person on your team can and will press the big red "Stop The Presses!" button, then you only have yourself to blame. Every person on a team should have responsibility appropriate for their experience, skill and view of the situation. If you have disputes, then you don't want an arbitrary process determining the winner -- you want a real, thinking, feeling human being to make the judgement call.

I'm rambling on here, but most of the breakdowns I see in teams come from the idea that either employees are stupid and so we should shield ourselves from them with layers of process, OR the idea that my colleagues are stupid and so I should make sure to politically rig decisions so I don't have to talk to them. This is basically guaranteed to result in failure.


Agile isn't about developer comfort, it's about the fact that nobody knows what they want until they see it and decide that they hate everything about it. It's about the fact that requirements are made of mercury. E-signatures on specs aren't worth the e-paper they're written on, and the stakeholders want it yesterday with today's changes to the requirements.

Agile isn't a process so much as it is conceding defeat on the very concept of process.


I like the mercury metaphor. Here's another one for Agile: don't build a stone fortress for a nomadic tribe.

If someone wants to give you a full specification and stick to it, you can build something fully specified. If they give you a vague idea that will change a lot, build them as little as necessary and make it flexible enough to change with the idea. If your code's going along for the trip, it may as well be easy to pack.


We should measure value, that is all - and that's what agile is supposed to aim for, not developer comfort. These processes offer negative value (prohibitive costs of change and an inability to execute in time, which massively outweigh any perceived security). Even a vaguely leaner organisation will eat their lunch.

And that error-proofing is entirely imaginary - they have added a ton of moving parts, handoffs, and manual processes, and so will have way more screw-ups than a small, tight team practising continuous delivery, as anyone who has worked in such environments will know.


"value." I'm getting this BS from a PM right now over a refactor job that needs to happen. Sustainability and tech debt be damned. Every time they want to support a new file type, it's editing several files, most of which are source, recompile, put QA resources on it, get IT to sign the binaries ...

If you'll get off my back, let me refactor and get my "developer comfort" on, you'll never have to involve me or QA or IT in supporting Just One More File Type ever again - you can push that from policy on the server side. But "developer comfort" isn't "value for the customer."

_sigh_
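
For what it's worth, the shape of the refactor is roughly a handler registry keyed by file type, so that adding a type becomes a registration (or server-side policy) change rather than edits across several source files. A minimal sketch with made-up names:

  import os

  HANDLERS = {}  # extension -> handler function

  def register(extension):
      def wrap(fn):
          HANDLERS[extension.lower()] = fn
          return fn
      return wrap

  @register(".csv")
  def handle_csv(path):
      ...  # parse/import the file

  def process(path):
      ext = os.path.splitext(path)[1].lower()
      handler = HANDLERS.get(ext)
      if handler is None:
          raise ValueError("unsupported file type: " + ext)
      return handler(path)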


Why is it the PMs business if you refactor the code or not?

That's a technical decision that they shouldn't have to know about and are not qualified to make any decisions about.

Or put in other words: The PM should be your customer, not your boss.


Indeed. This is the actual problem that needs solving.


Just do it. Seriously. I no longer have a job where I have to put up with this bullshit, but I did at previous jobs, and eventually I just started doing the tasks that needed to be done to make code nicer. I'd spend roughly half my time doing necessary tackling of technical debt. I even got thanked in the end, when new features post-refactor ended up not being the shitshow that they always had been prior. People don't know what they want until they have it. Lead them to the promised land.


Yeah, I've had much more success "leading by example" than by trying to fight against the tide the "correct" way.


> Every time they want to support a new file type

Why can't you say "we never designed the system to support more than X file types, so if we want arbitrary additions we need to take Y time to change the system to support that. Thus far, there have been Q requests for changes, and in my professional opinion we could save time now, and in the future if we make adding arbitrary filetypes to the feature set this sprint".

This places the refactoring as something necessary to accomplish business goals, and save time/money in the future while providing better service to the internal or external clients using the software.


I'll give you a slightly different perspective than the other replies so far. The problem here is not a lack of communication with the product manager. The problem is actually that decisions are being made at an inappropriate level.

As a developer, your goal is not really to write working code as quickly as possible, your goal is to provide the highest possible throughput of working code.

Consider the case of delivering A as fast as possible, B as fast as possible and C as fast as possible. Then contrast that to delivering A,B, and C as fast as possible. These are different problems. Your product manager may or may not understand this point, so ensuring that they do would be my first step.

To do this, I usually use the metaphor of cooking. You can cook scrambled eggs very quickly, and can save time by not putting things away or washing up. You can then make a salad -- again you will save time by not cleaning or washing. Finally you can make soup. But by the time you get to soup, your board is full of unnecessary ingredients, your sink is full of dirty dishes, you have no room on your burner for the pot, etc. Professional chefs clean as they go. Each task takes a little bit longer, but their overall throughput is massively higher (the Ratatouille scene where she says, "Keep your station clean... or I will kill you!", is a good visual to keep in mind).

But this isn't enough. Your product manager, once they realise this, will try to "optimise your experience". They will try to group tasks that they think will make you go faster. They will try to drop functionality which they think will make you go slower, etc, etc. Above all, they will keep asking you, "Can we make this change without having to refactor the code?".

In this way, you are going to start restricting what you can do and you will make your code more rigid because you are always treading around the boundaries and never fixing fundamental problems in the center. This is a sure fire way of making you go much, much slower and making your customers very unhappy (since you never do anything of any substance).

So what you need to do to remedy this situation is to cooperate with your product manager and agree where the appropriate place to make decisions is. For example, your product manager is the only person able to make a good judgement about what the customers actually need/want. It is usually hard for programmers to trust their product managers in this regard (because we tend to think of them as morons who are only pushing a schedule so that they will look good and get promoted). Making sure that your product manager understands that you depend on them for this aspect can go a long way to building a bridge. You have to create the idea that you are a team, working together to fulfil the same goals, rather than adversaries trying to achieve different things.

In exchange, the product manager will have to trust you that technical decisions will have a good outcome. If you tell them, "I'm going to take a detour", they have to understand that's because it will make things go faster -- you can see a roadblock ahead and pushing ahead will kill your throughput. This decision is yours, and yours alone. It is a matter of trust, though, so it will take time for your product manager to build their trust in you. Feel free to have a frank discussion about this and explain that it's difficult for you to trust them too. But when both sides trust the other, then there will be a much better outcome.

Now there is one last thing: emergencies. Occasionally, it doesn't matter whether there is a road block ahead or not. This is not your call. And as much as there are asshole product managers who will pull the "this is an emergency" card every 5 seconds, that's their call.

If you think the emergency card is uncalled for, balance your feeling for fixing the situation against the similar issue where you want to delay production to refactor code. That's going to be your call. Your product manager is going to be thinking, "Is it really necessary????" and your goal is to engender trust. It is a give and take, but your only route to success is building that trust.

In the end, if you find that you just can't trust your product manager (or they just can't trust you), I would talk with your management. If you still can't find an acceptable solution, then looking for a better situation elsewhere is probably your best option. Some people want to work in a broken way. You don't have to though.


"...the product manager will have to trust you that technical decisions will have a good outcome."

This is precisely the problem. PM refuses to believe engineering when we say "You can have feature F on date D" rather than on date D-21 when PM thinks they 'must' have it. ("must" in this case is also not backed up with data, but random desires of potential customers that come up in sales presentations...)


Of course they don't believe you though. If you have any experience you cook those numbers to deal with people like them!

I find fixed-length sprints harm agile. You're tempted to suggest sprint-length goals, not the minimum useful product. This leads to distrust, because everything looks like estimate stuffing, yet you, being honest, intend to fill those days doing cleanup, automation, etc. - all real work. And you resent every effort to trim even a minute.

Usually even the surliest customer can be worked with by simply cutting them in on more of the planning (you do bill for meetings, right?) and letting them decide how far to push a "spike", etc. When they see their demands are heard and turned into work (and billed for!!) they often become much more open to suggestion, such as to stop beating an expensive dead horse.

The goal is to be able to break down a project into functional goals (index the data, query it, script that, schedule that, add a GUI, etc.) After the first step or two you're producing value, on the way to the goal.

One of the hardest problems in development is finding a competent customer for a PO, who can drive the project with requirements and political support. The more tangible you can make the benefits early on, the higher-level stakeholders you'll have access to and the less trouble you'll face getting things done.


As I said, there needs to be a mutual trust here. The "I must have this on date D" is the PM's call, not yours. You need to trust them. If you can't, then it's already game over. The problem is not so much that feature F needs to be done on date D (if, indeed, it turns out to be possible). The problem is when every request has a date attached to it. This is where you need to negotiate. Very probably they think that if they don't attach a deadline to the task, then it will never get done. Management training has historically taught just this. Similarly, they may think that if they don't assign a challenging deadline to a task, then you will sit around reading HN until the last minute and then furiously get the work done. In other words, they don't trust you to get the work done as fast as possible.

It's that trust issue that you need to address, not the date. If you focus on the date, then you are doing exactly the same thing that the PM is doing when they complain that you insist on refactoring code -- you are abandoning trust and trying to make a judgement call from the wrong place.

Unfortunately, in this situation there are probably a few things you need to do to solve the situation. As usual, there are many ways to skin a cat, but I will tell you the way that has worked for me in the past.

First, if you have deadlines on individual features, then you have a problem. If you have features that span large amounts of time (as in more than a day or two), then you have an even bigger problem. From what you are saying, these two things seem likely.

The strategy I would suggest is to temporarily acquiesce to the deadlines. Yes, technical debt, but it will pale in comparison to the debt you will acquire if you don't fix this problem.

Next, split up the features into smaller pieces. Each piece should be (on average) about 1 day of work. So if your feature is due in 10 days, then you should have 10 pieces to build. Do not organise these pieces by design structure. Instead organise them such that every day you accomplish something that you can literally show the PM (it won't necessarily be possible every time, but do your best). Very important: doing this will require a lot of rework because you will have to put temporary design scaffolding in to make something user-visible every day. Do not avoid this!!!!! I repeat: Do not avoid this!!!!! You will have to refactor as you go. This is not bad.

Depending on what kind of thing you are building and what your deployment procedure is, try to deploy not-finished, but still useful bits as often as possible. Every day would be best if you can manage it, but don't go more than 2 weeks without a customer visible deployment (2 weeks is a rule of thumb, but you should probably view it as a hard and fast rule at first and adjust later when you have a feel for what works best). This may require you to split up the feature into sub-features that are complete in and of themselves. This can be challenging from a design perspective.

Your goal here is twofold. First, you are establishing credibility that you are working and delivering every day. Not only that, but if you miss the deadline, the PM has a very good idea of where you are with the feature. As you surmise, they probably don't care at all when the feature is done. They just care that you are working on it flat out. By delivering every day, you establish this fact.

Second, by deploying at least every 2 weeks, you are significantly reducing the pressure that the PM feels from other places. If you have an excellent PM, they will be insulating you from the white hot pressure that other business units are putting on your team. But even the best PM cracks and puts you under the vice.

Corporate attention span varies in different companies, but my 2 week rule of thumb has worked well for me in many different environments. It answers the "What have you done for me recently" question nicely. A stress free PM results in significantly more elbow room for you. Never underestimate this.

Now, it may be that you are already delivering dribs and drabs every day or two (because your PM is already pre-optimising your experience). If so you can probably go to step 2 faster. In this step you start negotiating to remove deadlines. Since you are already delivering every day and you are deploying every 2 weeks, negotiate the amount of work that you are going to deploy in the two weeks, while continuing to deliver every day.

So, if you have 10 reports to do, don't put a deadline on each report. Say that you will deploy all 10 reports in 2 weeks and that you will deliver functionality every day, just as you always have. Also negotiate that prioritisation happens every 2 weeks. So whenever they want something, they will have to wait (on average) one week before it is prioritised into the active development backlog. This will be a hard sell, but offer the addendum, "Of course if there is an emergency, we'll have to make adjustments, and I will leave the definition of 'emergency' up to you." (The general rule is that you remove a like amount of work from the backlog and replace it with the 'emergency'. If you have to stop work in progress, then it is lost and you emphasise that 'emergencies' have a cost).

It won't be smooth going all at once, but over time you should be able to negotiate a way that will work for your group.

Hope this helps! It doesn't always work out, but like I said: you always have the option to vote with your feet if you think that your partners just don't want to dance.


Agile emphasizes value for the customer, but "the customer" has a very broad definition. The end-user is first and foremost, of course, but the operations group who has to keep the service running and the development team who has to maintain it (i.e. you) and so forth are also considered customers. If your PM doesn't understand that, it's a failure on his/her part, not an issue with agile itself.


So...learn to make your case more effectively. PMs are people, too.


Do I really have to explain the backstory here on HN? OK, the case has been made. Repeatedly. By me, my team lead, the research intern who wrote the original code ... Project Management is in agreement. This Product Manager refuses to accept that we need to refactor. Doesn't want to spend the time to do it. "I understand technical debt, but..." "I understand sustainability, but..."

Please allow me to disagree with you that this is somehow my fault (which can apparently be corrected by my "learning to make my case more effectively.")


I'm curious, have you put it into numbers? You can probably even make stuff up (AKA estimate), as long as it has a rational basis. Something like, "Over the next six months, not refactoring will cost us 450 hours of lost developer productivity, based on how much time was spent fighting tech debt over the last two weeks. The refactor will take six weeks and 300 hours of developer time. If we do not refactor, we will have 20% developer turnover within two years."

Edit: It's possible that your PM will then say, "This feature will earn $250k per week. Delaying six weeks will cost $1.5m. Over two years, 1800 hours of wasted developer time is ~$150k, and recruiting 20% new devs will cost $250k. Let's spend $400k in lost productivity to earn $1.5m in additional revenue."


> I'm curious, have you put it into numbers? You can probably even make stuff up (AKA estimate), as long as it has a rational basis.

Engineers and developers usually have a strong aversion to pulling numbers out of the ass, and you cannot give such an estimate any other way (barring the most trivial patches).


I disagree here. Engineers and developers are great at making up numbers. Consider how many times in this thread people refer to "9 times out of 10" or other such statements. (Oddly, I have noticed that most of the numbers engineers make up are of the "90% of the time" variety. As if spouting a loose statistic is fine, but a date?! How could we?)

I think the problem is that we often think giving a point estimate somehow has to be 100% accurate. Or, worse, we estimate only on the lines of code that will be created and forget that it is all of the lines of code discarded in the process that take most of the time.

So, no, refusing to give a date does this no service. If something is going to take a long time, speak to the reasons it will take time. But FFS, do not focus on some intrinsic quality of the existing code base that is meaningless. It is not a customer problem that there is cruft in the codebase that embarrasses you. It is a customer concern that the last X times a feature was added you spent Y hours dealing with high-priority bugs that resulted from compromises.

And if you don't know X and Y, then you are worse than making shit up: you are losing credibility.


> I think the problem is that we often think giving a point estimate somehow has to be 100% accurate.

That's a learned behavior from all the times that estimates become deadlines.


Then the learned behavior should be to remember the reason the estimates were wrong last time, and to include those this time.

And no, I do not think it is that easy. The point is that refusing to speak simply robs the process of valuable input.


Often, in these cases, what technically skilled people fail to do is make the case in sufficiently precise dollars-and-cents terms. That is often partly because they fail to realize how much business types (which Product Managers mostly are, in practice) respond to that kind of presentation, and partly because they are professionally inclined not to let the precision of their forecasts exceed their accuracy, whereas business types are often conditioned to both create and act on SWAGs with precise cost-benefit estimates that, while they often have a post hoc rationalization, are essentially pulled out of the air.

I'm not saying this is the issue in your case, but I've seen it a lot.


So there's no engineering manager empowered to override the Product Manager? That's a serious organizational problem.

The Product Manager and Director of Engineering should be peers, not one reporting to the other.


I don't think the example is about optimizing for developer satisfaction.

1) He changed a number. 2) There was (apparently) a very simple test for the effect of the change being correct. 3) This was high-priority.


I think the fault lies in review procedures that require bringing the rest of the code up to standards if you just touch a tiny part of it. In a shop like this, these rules mean that nobody will ever touch legacy code if they can possibly help it.


This has been the case at both the company I work for in the daytime and the company I moonlight for. The looming nature of legacy code, especially combined with prohibitively work-expensive policies, led to much more than poor programmer effectiveness. It led to lots of extra lead time on deployments and heavily decreased morale, and when we lost our two rockstar 20/30-year developers, turnover exploded.

More often than not there's more to it than just policies and the amount of legacy code, but even when we had the most supportive of managers, morale was very low over it. I definitely think companies, as they age from startup to firmly established company, should take stock of their code assets and policies yearly, make sure their code is kept up to standard, and look at every policy to make sure it's absolutely needed. That's a lot easier than taking a 1-line change to a 9-year-old piece of code and turning it from a half-day full review into a 6-day, haltingly slow process.


> this has been the case in both situations.

Can you please clarify what the case has been? Has it been the case that procedures require bringing all code up to standards if just a tiny bit is touched? Or has it been the case that nobody will ever touch legacy code if they can possibly help it?

Thanks.


It's a good policy in general, but it needs to be flexible.

"Fix this unrelated stuff or no approval" needs to be overridden by "This is critical". "Fix this before you add this new feature" is pretty reasonable.


But why was the policy implemented in the first place? Probably to counteract an earlier development error where managers insisted that every change was priority 1, hot fix now. So the process evolved to be resistant to that.


Exactly. I've worked at places where "I'm the super senior manager, and I don't care about the process! Fix it now!" was acceptable, and at that point you might as well have no process. It inevitably turns into a train wreck.


or worse, work around the process to avoid it


Couldn't changing a bunch of code to conform to new standards actually create far more bugs?


That is a valid criticism, but this is basically institutionalized boy scouting, which is a good practice.


"Leave it better than you found it" is one thing, but "perfect everything you touch or don't touch it at all" is another, vastly more expensive and high-friction thing.


And hard rules like config files vs hard-coded constants reek of fetishization of process and rules over judgement. If something has been modified once in 10 years, with no near-term modifications planned, hard-coding it seems pretty reasonable.


> institutionalized boy scouting, which is a good practice

You brought up the name for it. Okay. Now can we get back to the debate on whether it is a good practice?

These people are debating whether boy-scouting is 100% required every time, or whether there are critical issues that require making it sometimes-optional.


> whether there are critical issues that require making it sometimes-optional.

There are obviously issues where it would be better to skip it, but that's the wrong question. The question that needs to be asked is: can the organization identify, adopt, and properly execute a system which identifies those cases reliably enough that the benefit of the true positives is not outweighed by the costs of the false positives? And, IME, the places that tend to have inflexible policies like this are also places where the answer to that is "hell, no!". Which is demonstrated dramatically (and expensively) every time someone who fails to recognize the organizational capacity constraint gets into the right position of authority and tries to implement that kind of flexibility.


I agree. Requiring things like stakeholder signoffs, code review, good test processes, and respect for team resource management is annoying, but it can also be incredibly important when you're trying to keep a big ship afloat.

Without those steps there's another article waiting to be written: "Rogue developer brings down business for six days with a seemingly innocuous 1-line change"


I've worked in two kinds of companies: Places where it's kind of a hassle to get code past a code review, and places where everyone's always having to put out fires all the time.

Of the two, I've had to unexpectedly work nights, weekends or even vacations at only one kind.

Guess which one I decided I'd rather work at.


I don't get it. The two groups are not mutually exclusive.

It can be annoying to get code past code reviews and yet have fires everywhere all the time.


It's very unlikely. Peer reviews spread information across the team, so at least two team members understand every area of the code, and the project manager can always add more people to an urgent or problematic task quickly.


I have worked in one place where you couldn't get the code to pass without tests AND a code review.

One of the methods I found had a cyclomatic complexity of 200+ and a single test, verifying that the result was not null.

I was told "no, you can't fix that, the NY office wrote it and they'll get angry with us if we change their code and refuse to help".

Things are worse than you think :)


Depends on how much people care. The code review could be half assed and not detect issues.


IME, those were the same place.


Except the end result was the boss said DO IT NOW and it got done now.

This seems to be a pretty clear example of not making the process work for you. Unwritten policies. Having to escalate twice because of silly reasons. Hot fixes being miscategorized as features. Is a very urgent fix really the place to be fixing a bunch of other tech debt?


The cost to the business of an occasional goof is less here than the prohibitively high cost of making any change under these inappropriate and frankly insane processes. (I've held management positions at such companies and easily recognise these kinds of processes; in fact, the post may have triggered some mild PTSD.)


This is one of those "all of software is a tradeoff" kind of thing. Here, the company optimized for safety, with the tradeoff being that it takes much longer to ship. Startups optimize for shipping, and consequently ship broken code way more often.

OP demonstrates the fatal flaw in having process: poorly created process can be crippling. In this case, it may have led to significantly lower profits; if they had a way to fast-track something immediately, they could have avoided delaying a change this simple.

Another flaw is that process, if it's really bad, almost encourages people to circumvent it. At large, slow companies, the devs who push the most code are not necessarily the best and the brightest but rather those who can bend process in their favor.


If we're dealing with a company-wide hard-coded variable that only changes when the company is in danger of layoff, I'm pretty sure that:

(1) Making the variable a configuration variable will do nothing: it's not like you want two different versions of company policies at the same time.

(2) It won't help with testing, because any realistic test of the variable's change will involve changing the company policy and probably the way all systems behave.

In my opinion, making that into a config variable is a mistake. It's perfectly OK to keep as a hard-coded constant.


It is not a mistake. One literal 3 is not distinguishable from another literal 3. If you change it in one place but forget about another, that can lead to inconsistent behaviour. Configuration variables are much less error-prone.
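
For what it's worth, here is a minimal sketch of that hazard and of the fix that does not require a config file at all (the names and values are hypothetical, not taken from the article's codebase):

  // The policy lives in exactly one place; bump it to 4 here and every check agrees.
  const int kMaxBacklogMonths = 3;

  bool CanScheduleWorkOrder(int monthsOut) { return monthsOut <= kMaxBacklogMonths; }
  bool ShouldWarnPlanner(int monthsOut)    { return monthsOut == kMaxBacklogMonths; }

  // With a bare 3 written into both functions instead, changing one occurrence
  // and forgetting the other produces exactly the inconsistent behaviour described above.

Whether the constant then also belongs in a parameters file is the separate question the rest of this thread is arguing about.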


Configuration variables make the code ever-so-slightly longer, and introduce additional failure points (what if the config file is missing or corrupted?) and considerations for cases that will never arise with hard-coded values (should we check that the limit is always positive? what if it is 10000 months - should we do something?).

Yes, in many cases they are an improvement, but not always. Every "best practice" must be judged individually with the question "does it apply here?".
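
As a rough illustration of those extra failure points, here is what even a minimal, hand-rolled parameter lookup tends to grow into (the file format, key name, and bounds below are all invented for the sketch):

  #include <fstream>
  #include <stdexcept>
  #include <string>

  // Reads a single integer parameter from a whitespace-separated "key value" file.
  int LoadBacklogMonths(const std::string& path) {
      std::ifstream in(path);
      if (!in) {
          // Failure point 1: the file may be missing in some environment (e.g. UAT).
          throw std::runtime_error("parameters file not found: " + path);
      }
      std::string key;
      int months = 0;
      while (in >> key) {
          if (key != "backlog_months") continue;
          // Failure point 2: the value may be absent or non-numeric.
          if (!(in >> months)) {
              throw std::runtime_error("backlog_months has no numeric value");
          }
          // Failure point 3: the value may parse but still be nonsense.
          if (months <= 0 || months > 24) {
              throw std::runtime_error("backlog_months out of range");
          }
          return months;
      }
      throw std::runtime_error("backlog_months not set in " + path);
  }

None of that code exists, and none of it can fail, when the value is a compile-time constant.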

Years ago I was working with some very small piece of a web-crawling framework, and I had to do something special for www.naver.com, Korea's biggest portal. Because the string "www.naver.com" appeared multiple times in the same function, we put that into a string constant at the beginning: something like

  static const string kNaverUrl = "www.naver.com";
Because having "magic constant" sprinkled in the code is bad!

...it was only much later that I realized the stupidity. As if Naver, Korea's #1 portal, would change its domain name on a whim! Even in the unlikely case that happens (say, they get acquired), so many other things will also change that my code will surely be unusable after that.

kNaverUrl was an unnecessary detour that only served to make the code longer and (slightly) less obvious, guarding against a theoretical possibility that will never come true.

Code review has this nasty tendency of accumulating "best practices" beyond the breaking point, because nobody objects to best practices. Once in a while, these practices turn out not to be best after all.

Just my two cents.


I don't know how your editor works, but my Vim diligently helps me with identifier completion, so kNaverUrl wins over "www.naver.com" every time, even if it doesn't make the code more flexible.

It also defends against typos, as serge2k pointed out.


It's also easier to refactor code without magic constants: you get compile-time errors from misspelling kNaverUrl instead of runtime bugs because "ww.naver.com" crept into one of 10 places.


You have a point, but on the other hand, it is much easier to accidentally write (kNaverUrl + "index.html") than to write "www.naver.comindex.html". So it just opens the door to a different kind of bug.


Then it becomes a function:

  kNaverUrl('index.html')
Multiple functions:

  kNaverWwwUrl('index.html')
  kNaverApiUrl('resource')
Then it gets refactored:

  urlGenerator('www.naver.com')('index.html')
  urlGenerator('api.naver.com')('resource')
Refactored even more:

  urlFor('naver_www', 'index.html')   
  urlFor('naver_api', 'resource') 
Even more!

  urlGeneratorFactoryFactory(.......


> I had to do something special for www.naver.com, Korea's biggest portal. Because the string "www.naver.com" appeared multiple times in the same function, we put that into a string constant at the beginning: something like
>
>   static const string kNaverUrl = "www.naver.com";
>
> Because having "magic constant" sprinkled in the code is bad!

You have changed your magic string into a magic constant. It's a step towards better code, but the next step is to give the thing a proper name. In this case, something like DoSomethingSpecialUrl. Now you have a bit of self-documenting code.
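
If the parent's suggestion were applied, it would be a one-liner along these lines (the new name is invented here purely for illustration):

  #include <string>

  // The name now records why the URL is special-cased, not merely what it is.
  static const std::string kDoSomethingSpecialUrl = "www.naver.com";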


> As if Naver, Korea's #1 portal, will change its domain name at a whim

Maybe they won't change their domain name, but what if you end up needing to move from http to https? Or what if you want to crawl their staging server at staging.naver.com? Or if you wanted to bypass the DNS?

Avoiding the magic constant achieves more than just guarding against name changes. It also guards against typos: you reduce n potential failure points to 1.


static const string kNaverUrl = "www.naver.com";

That's such a common junior developer mistake. I see it a lot - constants called ONE_SECOND or COMMA.


It was a hard-coded value before. The OP just wanted to leave it as a hard-coded value. The code would have been just as hard-coded after his change as before, and it would have been improved just as little by the change. The policies in place seem to pit the perfect against the good.


My take on this is that modifying the hard-coded constant was not a functional change to the code base, while making it configurable was. You shouldn't make functional changes to old, battle-tested code unless you have to; doing so just adds risk.


Configuration variables are, in my experience, considerably more error prone. If something is in a configuration file, some day, somebody is going to want to configure it, not knowing how it works, what the valid values are, or what it even does. Moving something to a configuration variable means you have to do all the work to sanitize that input and defend against invalid values.

Hearing "Oh, make setting XYZ configurable" or "Stick a flag to turn FooBar'ing on or off" also fills me with worry, as all too often it has been the harbinger of terrible feature creep and piling on of technical debt.


Nobody said it needs to be a magic number. A descriptive constant name will address that. However, a setting in a config file invites tampering, which in turn means more complicated tests to account for that. It might make sense if frequent changes are anticipated.

However, not every hard coded constant should or needs to be in a config file.


My impression was that "hard coded variable" meant a named constant, i.e. something that was not run-time configurable.


So make it a CONST.


The pain for me here isn't that process can take some extra time, it is that the developer doesn't know what is expected and keeps getting surprised at every step. It can soak up a lot of calendar time and developer motivation to have a code change swatted back and forth as different people weigh in with various undocumented guidelines/policies/standards over a period of days or even weeks. This could have been a very different experience if developer had known what was expected from the start.

Ideally I'd like to see code reviews culturally in an organization be mostly about catching bugs and collaboratively learning the craft of coding better. It can be very frustrating and demotivating when it devolves into a series of undocumented style and naming requirements. I think it is good to have standards and process but seems like most places I've worked are woefully short on documenting these hidden requirements.


I think there are certain things here that should go and things that should stay.

Reviewers should not be afraid to reject code, changes should be tested.

However, there are both bureaucracy and code reviewing things that I would change here to not slow things down unnecessarily:

- If the code practices document hasn't been updated, pending future changes shouldn't block your PR

- You should never have any trouble finding someone to review your changes - the whole team should be empowered to review.

- Strictly, only the lines you touch should need to follow any existing code practices, not the whole module because you happened to change one line. Otherwise people will be scared to do small fixes.

- You should only be forced to write new tests for functionality you've actually added, for the same reason. As well as of course fixing tests you break with your code.

- Don't even get me started on Ed not having access to Marge...

- Why does this piece of work need to be done in the first place? Why would a backlog be limited to 3 months?

TL;DR: There should only be as much bureaucracy & strict rules as strictly necessary. Unnecessary bureaucracy is the death of companies, large and small.


>- Why does this piece of work need to be done in the first place? Why would a backlog be limited to 3 months?

If it's a literal factory, they may want to minimize storage costs. You have to rent space to keep the product until the customer expects it delivered. I'm a layman, so take my guess with a grain of salt.


It's nice to reject code, but code reviewers must also be mindful of opportunity costs when wasting developer time on trivial concerns. (Being the cynic that I am, I guess at some point Corporate America will introduce code review reviews to solve this problem.)


> Code review reviews

Good tech managers do this today. I do. My direct reports that give good code reviews are more valuable to me, so I need to know how well they are reviewing code.


Exactly. It's a metamod process.


Amen! For an enterprise system that doesn't have end-to-end automated tests, this sounds like a sane organization running as best it can. The process is optimized to not let incomplete code make it into production. This is a company doing it right given what they have to start with. Changing that ship won't happen overnight.


But will anyone change it again? How many times would they have to change it for this to be better? Maybe the developer should build a UI so the business owners can change it themselves. After all, business people editing config files directly on production machines is dangerous. Clearly the conf file should be in source control and subject to code review. And so on and so forth. Reasonable changes accrete.

The real question is: What is this policy and process protecting against? Does it work?

We're all risk averse. Bad things happen. We come up with processes and systems to prevent them. We write our postmortems and say "How could this have been prevented?" and we say "Heavyweight process X and automated, difficult to maintain, high overhead system Y" and everyone nods their head sagely and makes plans to implement X and Y. Then we make a new mistake and postmortem and build process Z and system W.

But you know that old saying "Lightning never strikes the same place twice"? Well, it's wrong. Sometimes it does. Especially over a long enough timeline. But it's pretty rare. We're the soldiers preparing to fight the last war and losing the next one.

Personally, when I see stuff like this I always think of https://medium.com/@webseanhickey/the-evolution-of-a-softwar...

It's a cartoon. But I'm getting old and angry. 15 years in, I think that most of this stuff that we do, these pedantic code reviews, processes, policies, best practices and test plans we're always beating each other over the head with, they don't always pay for themselves. We still write bad software. We still make stupid human mistakes.

There is no answer but good judgment, trust and respect. You need as much of it as possible. I think it's easier if you believe Clayton Christensen and build protected small teams that can operate in a limited bureaucracy and limit the connectivity between them and groups viewing them as "others". But again, trust, respect, autonomy and good judgment.

Or just make shittons of money. With lots of money, you can afford to hire lots of smart people to do stupid stuff and take forever doing it.


Here's some.

David/Julie: Having processes that allow critical issues to sit idle in a backlog for 2 days.

Shirley: Delaying a code change because of completely unrelated issues. Code changes that introduce bad code should be fixed first. But demanding that the person fix all unrelated issues in the file is not the way to go.

Tony: Not having tools in place for quick, completely automated testing. If a one line change requires 3 days of manual testing, something is wrong.


For established businesses, I wonder if you could come up with a useful metric for how much value you add by rejecting a commit and thus avoiding an operational incident.

Mature, high-profit systems warrant much greater scrutiny on incoming changes than businesses/systems which aren't spinning off money yet. You can err in both ways: too slow where you need to be fast, and too fast where you need to be more careful. And AFAICT companies/organizations build processes, habits, and platforms around one mode and have a hard time switching to the other.


Yes. Here, every second of downtime is over $10, if it affects all users.

Last time we had a partial outage that only affected a piece of the user base for a few hours, we got a first estimate of over 4 million dollars in direct revenue loss, and some percentage of that as guaranteed direct profit loss.


So when stackoverflow or github goes down, and they trace the bug to some legacy component, they should refactor the entire thing, taking a week, instead of increasing the memory on their VM, bypassing the release process, then doing full postmortem once the fire is out?

This isn't dev. Running a factory is ops. It can't take 6 days to renew your domain name when you realise it expires in 2. It can't take 6 days to patch your server when some script kiddie takes it down the first time. Not everyone has a job that gives them the luxury of making sure everything is perfect before they commit to anything.


There are two paths issues can take. One is the regular path, which follows procedures for good reason. The other is the fast path, for things where time is of the essence, e.g. a bug that takes the entire system down so no one can do anything - issues like this must be solved ASAP, and the code mess, edge-edge-case testing, and documentation will have to wait until later (but will be top priority). This change sounded like the latter, and putting it through a 6-day wait to get out something that needed to be changed yesterday is too slow.


The fault is that they're missing the "fuck the process, let's get this done by yesterday morning or we're all gonna die" button.

(Of course, the rule for this button, when you have one, is that you can only press it like once per month or some other arbitrarily long interval. Oh, and maybe having a physical button that actually sets off an alarm in the office could help too. Maximum annoyance to prevent abuse.)


I realized this as I was writing another reply, but it can be faulted for forcing a system upgrade onto an independent issue. Separation of concerns, people! Messing with legacy code is not a trivial matter, and if you are going to do it, make it its own project.


What about

> Also, there are 2 old Debug commands, an unassigned variable warning message, and a hard-coded Employee ID that will all have to be fixed before this module can be moved to production.

F* that s* indeed.


This sounds like a place programmers go to die.


I'll go further and say that this is what people say they want: time to refactor out tech debt. The headline is a lie - there was tech debt that it was company policy to take time to fix (parameterising some hardcoded variables, amongst other things). "Mission critical LoC" is not the same as "LoC" in their little summary.

People beg for time to fix tech debt, and right here is someone complaining about having to do just that.


As I read it, he's complaining about (1) gating high-priority releases on tech debt (priority inversion); (2) the fetishization of rules over judgement (no hard-coded variables that have been changed once in 10 years with no near-term updates planned); and (3) the systemic lack of urgency.


Yeah, but this is "top of the queue" priority, not "must get done today" priority. The managers were talking about lead times and the potential of laying someone off, not "if this doesn't get fixed today, someone is going home hungry!". There's no mention anywhere in the story of someone actually at risk of being let go due to the delay.

I've also worked support at a startup that was all about the whole "we don't really need tests for trivial changes!" factor. Oh, what fun it was when the devs would push straight to production at 1am, with an "ah, it'll get tested in the morning sometime, maybe"... in the meantime our farmer clients were using the tool well before office hours to plan their days around irrigation requirements. Those changes that devs thought so trivial were usually trivial, but sometimes were quite catastrophic due to unintended interactions.


As per the article, there were both unit tests and a pre/post test run initially provided for this change.


This is why startups kill big companies.

Lots of stupid little processes that add up and make the managers feel powerful, but end up killing the performance of the business.

Brings up the interesting differences between leadership and management. Managing is what gets people into situations like this, whereas leadership (in late-stage companies) is what gets people questioning these processes instead of moving like cattle to the slaughter.

I saw this exact type of behavior at General Electric back in 2011. People wanted to argue about acronyms and process design all day, whilst Samsung ate up 20% market share in 1 year. That business of GE no longer exists and was sold to a foreign buyer. Move fast or die -- this rule does not only apply to startups!


It's easy for startups to be nimble because they have no legacy processes to maintain.

The startup may have 5 guys in a warehouse building their product, while the big company has 100 people on a $10M assembly line.

When the startup wants to scale by 20%, they work longer hours or add another worker. If they didn't plan properly and find that they don't have another workstation for that worker, they buy a table at OfficeDepot and there's their new workstation.

The assembly line might need an entire process flow redesign to add more workstations, plus a physical reconfiguration, software changes, etc.

If a developer just blindly adds a parameter that adds 10% to the workday without going through reviews and tests, he may find out from the physical plant engineers that they no longer have enough time to do preventative maintenance, so they need to schedule an outage to do deferred maintenance.

There's a reason why process exists in large companies, and it's not just for the sake of process.

That small startup might be able to usurp the large corporation, but the large company still has an advantage in throughput and/or efficiency. The corporation can promise Walmart that they'll have 100,000 units delivered by Dec 1st... to meet the same goal, the small startup might have to scale their workflow 100X, and they can't do that by scaling from 5 guys in a warehouse to 500 guys in a warehouse.


I don't know, for every justifiable decision I've seen in corporates there have been ten where it was all about politics, needless bureaucracy, incompetence and power and nothing to do with getting things done.


> whilst Samsung ate up 20% market share in 1 year

Right, a 78-year-old company with hundreds of billions of dollars of revenue is a startup killing big companies.

I'll just note that while Samsung is about half of GE's age, it makes twice as much in revenue.


The comment you are replying to included:

    Move fast or die -- this rule does not only apply to startups!
So I think they are aware that Samsung is not a startup.


Apparently "startup" doesn't mean "company which has just started running" these days, it means "company which is growing rapidly."


I don't know. It seems more like 99% of startups die, and 99% of large companies stick around. Not that I enjoy working for large companies, but just saying...


Large companies stick around because they've solved the one problem all startups have - finding and keeping customers and turning them into revenue through sales and product pipelines.

Code is pretty much a footnote to that process. It's exactly one of very many tools a business can use.

It's true that technical debt can cause business problems. But it doesn't cause nearly as many problems as having an ineffective sales team, getting killed by regulation, or being eaten alive by a competitor.


Right. For every startup that skips process and gets lucky there is a pile that rolled snake eyes and died.

Big companies aren't taking the chance.


Big companies tend not to take that chance even when it's worth taking, though. Not everything is mission-critical, but the procedures around it often pretend it is.


> 99% of startups die

And no one bothers suing into the ground a muppet with two packs of ramen to their name ...


Startups can afford to have their critical systems explode.


Considering how inefficient big companies are it's surprising that they don't get killed in the market immediately. You would expect that there are plenty of nimble newcomers who kick their asses but that's not really happening.


There are advantages to scale and disadvantages. Just looking at the disadvantages makes you wonder how giant companies can exist. Just looking at advantages makes you wonder how tiny companies can ever hope to compete.


In software it's often because they can afford expensive lunches with the people buying their software. The buyers are several steps away from the users of the software.


Some inefficiencies are more equal than others. Brand, the number of people in your network, reputation all add up.


Two words: Regulatory Capture.

See Also: Boeing, IBM, CGI Group, AT&T, NRC, etc.


Honestly, that's way too simplistic a point of view.

The biggest reason large companies that move slowly continue to succeed in their original business is that their original processes and flow date from when they were growing, when they were still learning and more agile in their space. As a result, they still generally provide much more value there than the average startup can match, in at least a few areas specific to each business.

That's why the blue ocean approach often succeeds much more than trying to take on a large company at their own game. It's hard, because it's a huge organization dedicated to being good at that one thing- and nothing more. So the better answer is to sidestep, not to take them on directly.


I don't read that as being the point of the article. Regulation is not even mentioned or alluded to. Also, my understanding of regulatory capture has little to do with regulation slowing down business. If anything, rather the opposite. But that's off-topic anyway, and definitely not appropriate for this week.

Edited to update comment on regulatory capture.


I don't know what you're talking about, it's in there plain as day. The company has a department called IT Security. Not IT. Not Security. IT Security. Companies that legitimately care about security integrate security personnel with all of their operations, into every department. Top-down security departments exist to check boxes on regulatory forms. So I will be shocked if "IT Security" is not a made-to-order department whose existence is solely to satisfy things like PCI, HIPAA, etc.


I understand your point if you're referring to government regulation. I don't think your argument makes sense if you're referring to "regulatory capture":

Regulatory capture is a form of government failure that occurs when a regulatory agency, created to act in the public interest, instead advances the commercial or political concerns of special interest groups that dominate the industry or sector it is charged with regulating.[0]

[0]: https://en.wikipedia.org/wiki/Regulatory_capture

In the case of regulatory capture, you'd think regulation would most likely be reduced rather than impeded.


> In the case of regulatory capture, you'd think regulation would most likely be reduced

No, in regulatory capture you expect regulation to be increased so as to form a barrier to entry to new competitors, while advantaging (comparatively) established incumbents that have the inside track with regulators and the regulatory process.


Makes sense. Thanks for the insight. I believe my main point stands: the term "regulatory capture" is not what the commenter intended, nor is it appropriate in this case (as it's regulation in general that would affect this rather than the tangential effects of regulatory capture).


True, but the greater danger of integrating security into every department is discounting security in favor of the main requirement of your department. Your department is judged on political and performance grounds, and budgeting time away from the main goal of your area towards security is, technically ... waste.

Additionally, you ignore the business logic cross-departmental issues when you do not have a centralized security department.


What? In any large company, you are going to have access controls - not everyone is going to be allowed access to every system. In the example given, the guy was requesting access to a new system he didn't previously have access to; having a security team that is in charge of that access doesn't seem to have anything to do with regulation.


Just sign the paper for PCI or HIPAA and here you go. Fully PCI and HIPAA compliant.

There is really no need for full IT Security PCPAA compliance department.


Reminds me of the bike shed effect: https://en.wikipedia.org/wiki/Law_of_triviality


It's easy to criticize this type of thing as wasteful and inefficient for the one time in a thousand where it really is just a single line code change that doesn't break anything, but the reason things like mandatory code review, required tech debt fixes, and all of the rest end up existing is because developers are notorious for saying, "Oh, it's just a one line change!" and then breaking everything when they slam it through to production without thinking about it.


So... one of the places I work is one of those 'nimble little startups', and it gets more and more bogged down by the day due to some poor technical decisions taken early. There are seven different kinds of timestamp in the database, for example. That's pure dev overhead that gets applied to everything. And because the nimble little startups are pressed to work on new features rather than refactoring, more tech debt just gets added to the pile...

Also, don't forget selection bias - most small companies, startups included, die in the first few years. Only the success stories make the papers.


Related anecdote: I interned for a medical device company over the summer of 2015 and worked on a team that was in charge of maintaining several implantable class 3 medical devices. The devices used a short-range proprietary wireless standard to talk with an intermediary device, which could be connected to via Bluetooth. One product had a pretty serious usability issue where connecting to the intermediary device could take upwards of several minutes. After the issue was discovered it took roughly an afternoon and several lines of code to fix. However, in order to provide software updates for a class 3 medical device, each update needs to be individually approved by the FDA. It took 12 months from the time the fix was implemented to when the FDA allowed it to be pushed out.


What is the approval process? Do the FDA manually check the device (similar to FCC certification for wireless things)?


I never had the opportunity to be involved in the FDA submission process, but from what I understand the FDA does not do any testing of its own. It's the company's responsibility to prove to the FDA that their device is safe and effective. All the testing is done by the device company, carefully documented, and then reviewed by the FDA. I've heard stories that the documentation for a class 3 medical device can be upwards of 20,000 pages, all printed and handed to the FDA in 3-ring binders.


I think the word you want is "anecdote".


thanks


If this had happened at any of the companies I started, I would have fired several people in the aftermath. The problem is that the policy is broken by design. Let's look at what is wrong here:

1.) Priority is not just high, it is critical, and communicating this is lost at each layer (executive, planning, execution, process control, quality control).
2.) Leadership is lax; the chain of command doesn't designate a clear single responsible individual.
3.) Policy enforcement in this example actually increases the risk of an unsatisfactory outcome, by increasing the complexity of the solution vs. what is in production.
4.) Quality control is adversarial and ass-backwards; code review is supposed to be a sanity check, "does this code do what the developer thinks it does", aka "can some other person understand it".
5.) Test planning should not be the developer's responsibility; quite frankly, if QA can't figure out whether it is working or not, you should fire your QA department.
6.) Ultimately, it is a total failure of policy and management, as it requires the President of the company to micromanage the situation.

If you think any of this is fine, I'm sorry but your company is doomed to fail (unless it is already so big it is too big to fail).


You would fire them, instead of trying to build up their skills? I'm not even sure who you would fire in this situation, the IT director for not shepherding the ticket all the way through, or Shirley for closely following company policy? Either would be overreacting.

Slow/inflexible company policy has been built up, presumably under the IT Director's watch. The president didn't communicate that it was extremely urgent (and maybe it wasn't - the article doesn't say when the change actually needed to be made by to make a difference). Ideally, management can recognize that this change took a week because of their accumulated decisions, and lack of urgency. They should figure out exactly how their company policy is slowing them down instead of letting heads roll.


Don't fire. Don't touch anything.

Just remember chapter 12 of Peopleware.

Between two organizations, productivity can vary by a factor of 10. It's not about you; it's a social characteristic of the organisation.

Just leave and find one whose speed suits you.


Specifically regarding your code review and QA comments, code quality is everyone's responsibility, not just management's or QA's. When development gets the idea that only QA is in charge of quality, development tends to write sloppier and sloppier code.

The real reason why this one-line change took so long is that the bureaucracy hasn't been automated away yet. A policy like not allowing magic numbers shouldn't be stuffed in a document; it should be enforced by a linting tool, and that tool ought to be runnable on the developer's workstation. If you have a QA department, QA should be spending its time writing automated tests, not maintaining and running manual tests. The developer ought to have been able to take 5 minutes to check out the codebase, make the quick change, run the tests close to the change, and then submit to CI, which would run through the entire test suite and then continuously deploy to production. QA's approval should be built into the codebase and CI. Even in big companies, the process should not take longer than an afternoon.
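
To make that concrete, here is a rough sketch of the kind of automated check that would let CI vouch for a parameter change instead of days of manual testing (every name and value here is invented, not taken from the article's system):

  #include <cassert>

  // Stand-in for the value that would really come from the Parameters file.
  constexpr int kMaxBacklogMonths = 4;

  // The scheduling rule under test: work orders inside the window are allowed.
  bool CanSchedule(int monthsOut) { return monthsOut <= kMaxBacklogMonths; }

  int main() {
      assert(CanSchedule(4));    // the new policy: four months of backlog is fine
      assert(!CanSchedule(5));   // anything beyond the window is still rejected
      return 0;
  }

Pair something like that with a lint rule that rejects new magic numbers, and the "no hard-coded variables" policy enforces itself on the developer's workstation and in CI rather than in a reviewer's queue.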


"I this had happened at any of the companies I started, I would have fired several people in the aftermath."

And then no one would ever try to block a code review again, and your code quality would go in the toilet. Then something like this would happen again, except with a much less benign change, and your servers would blow up (not literally).

"1.) priority is not just high, it is critical"

Priority is always critical for whatever little changes management wants.


> Priority is always critical for whatever little changes management wants.

Which is why in an actual emergency, management needs to actually communicate with the people involved in the development pipeline, not just hope the normal flow will magically work faster.

This used to happen constantly at my office:

Boss: I made a ticket to [change thing]. It's urgent, and please let [coworker I pass tickets to] know that it's urgent too.

...

Me: Coworker, I just sent you a ticket, [boss] says it's urgent.

Coworker: Everything is urgent. Shrugs and goes back to working from the top of his ticket list, all of which have the same high-priority flag as the one I just sent him.


> priority is not just high, it is critical, communicating this is lost at each layer

This was my exact thought. Why did a "fast tracked" ticket spend two days in a queue? Why did the requests to IT not also get set to urgent priority? (An IT request that blocks an urgent code change should itself be considered urgent.)

Most bizarrely to me, why the hell did they not defer updating the code to meet newer company policies? For the hardcoded variable thing, I would've said "ok, just change the value now, then create a separate, non-urgent ticket to bring that code up to our current standards."

It sounds like their bureaucracy isn't equipped to deal with fast-tracked changes. The worrying part is that management fed a critical issue into that bureaucracy and didn't prod it along until they noticed the delays. They may not fully appreciate how complex and full of roadblocks their process has become.


1, yes. 2, not really. 3, no. 4, not really. 5, yes, but does not affect this case.

If this company runs into this kind of problem often, then something can be done, but if this is a rare occurrence, it's not so bad.

Remember, the dev did not 'waste' 6 days of his time; rather, it took that time for the cogs to move together on it.

And even with 1 ... it's not super clear.


We're actually not sure about (1), as 'kevinstubbs points out.

(2) It does appear that David, the IT Director, was responsible, though he was out of town during part of the process. (What? In 2016 "out of town" means "not reachable"?)

It's on (3) that I totally agree with 'cthulhuology and disagree with you. Changing the value of an integer constant in code is far less risky than what the code reviewer demanded. As others have argued upthread, the original one-line change should have been put through to production as it was, and the additional changes should have followed later via the established process... if indeed they were truly necessary at all. (As someone else pointed out, a variable whose value changes once in 10 years probably doesn't really need to be a configuration parameter, though making it into one probably counts as a slight improvement.)

For (4), okay, here I don't entirely agree with 'cthulhuology. There is a place for enforcing standards; code review isn't just about comprehensibility.


> What? In 2016 "out of town" means "not reachable"?

The text explicitly says David's signature was needed. Unless they have an e-signature setup, or he can locate a fax machine, it doesn't matter how "reachable" he is.

(That's assuming that "out of town" doesn't mean "on vacation", in which case "not reachable" is in fact quite reasonable.)


> (As someone else pointed out, a variable whose value changes once in 10 years probably doesn't really need to be a configuration parameter, though making it into one probably counts as a slight improvement.)

I'd argue that adding a variable that changes once in 10 years is not an improvement. Config files should consist of things that actually need to be configured. Adding crap you'll never touch makes it harder to find the variables that are actually important, and also makes it harder to read and understand all available options.

Introducing configuration variables for the hell of it is analogous to writing every method to be as generic as possible; useful but not worth the risks and expenditure. Just like how we often draw the line and say "this method doesn't need to be any more generic", config variables similarly need an approach more nuanced than "move everything to the config file".


"Changing the value of an integer constant in code is far less risky than what the code reviewer demanded."

This depends on context.

An integer variable can kill humans if it's the wrong variable.

Was it related to safety?

Who knows?

I'm assuming the reviewer knew what they were doing ...

Maybe some common sense all around would have helped here.


It's the president's fault for letting this problem rise to 10% and risk layoffs before acting. Like most leaders, his psychopathy is visible by the wake it causes.


For me the absurd part was this one:

"Philip (President): Our factory is underutilized by 10%. Either we start building more of our backlog or we lay people off. I'd rather keep everyone busy, build inventory, and get ahead of the curve before the busy season. How can we do that?

Lee (Operations Manager): Company policy restricts us from building more than 3 months of backlog. If you just change that to 4 months, we'll have plenty of work."

You can't have one week of 10% less productivity without immediately firing everyone? And your company policies can only be changed by changing a line of code? And this line of code is in some old legacy code that no one touches, yet your whole process depends on it? No, 6 days of development is the least of the problems here. This company is absurd.

6 days for the line of code wasn't optimal, but it wasn't that bad. What this story shows me is how bad management pretends things are IT's fault.


It didn't mention firing anyone. When I see 'layoff' used in the context of manufacturing, it usually means something more like a furlough (i.e. the laid-off employees stay home and don't get paid until the company has enough work for them to do). In this case, it seems they'd rather not do it, but if the software change drags on long enough, they'd have to start layoffs.

As for requiring a code change to implement new policy; maybe not ideal, but it looks like they're trying to move away from that with the requirement that hard coded constants be moved to a parameters file.

And in general, using the company's production software to enforce business policies is probably an important management control. Often when you see rules like that, it's because the company got burned by something in the past and wants to enforce a process that prevents it from happening again. In this case, maybe in the past an operations manager was evaluated based on how busy the factory was, and so just kept building up more and more inventory to game the metrics he/she was being judged by.

I think it was a tad inaccurate of the manager to call it 'legacy' software. It might be an old system, but it appears to be critical to the company's operations, and it seems to be something that is worked on fairly regularly. Even if it's written in Cobol and JCL, I'm not sure it would be fair to call it legacy software unless the company is actively working on a replacement and making only necessary changes to the old system in the interim.

So, while I understand your concerns, I'm not sure it's fair to pin this on bad management. It sounds to me like a company that's probably been in operation for a long time, and has processes in place to make sure nobody pushes any changes to a production-critical system without proper approval and testing. They could possibly streamline the process, but it doesn't seem (necessarily) crazy based on the information we were given.


Ugh. that's sad. I'll be sure to avoid old manufacturing companies when looking for jobs then.


This even isn't the absurd part.

If you increase the backlog time you are building for, then it's a one-off increase, it doesn't increase the quantity of new work coming in, so once they've worked through this one-off extra with their 10% spare capacity, things will just go back to how they were before.


That can be okay if the same 6 days can be used in parallel to change 999 other lines of code, too.

I.e. "1 line in 6 days" is only a measure of latency, and not of throughput.


> a measure of latency, and not of throughput.

You could drive a station wagon full of backup tapes quite far in 6 days :)


It's not about how far, but how fast it is going. The bandwidth of that wagon is about how fast those tapes pass a given point of travel. How they are arranged inside the vehicle is critical. Tapes moving at 100 km/h side by side have more instantaneous or "burst" bandwidth than tapes moving at 100 km/h in a single file.


A risk of allowing too much bureaucracy in commit processes is that engineers try to avoid the slowdown as much as possible. For example, several unrelated changes may bleed together into “super commits” so as to clear bureaucratic hurdles only once. Next thing you know, it’s harder than ever to identify the exact cause of any one issue: you have more changes in each commit, fewer comments than ever, and a high probability that no thorough review was really conducted.

Large blobs make the review process completely fall apart. In the past I dug up too many cases of people allowing 50-file steamrollers into a repository when the entire “review” was literally two words (“looks good”). While this probably happened due to time constraints, it makes the entire thing pointless. With a simpler commit process, engineers might submit smaller change requests at a time and people asked to review “7 files” might do a thorough job instead of balking at requests to review “50 files”.


One of the things that happens as a result of anal code reviews is that at some point you learn to commit as little as you can get away with (and this includes things like comments and tests).


Avoiding comments or diagnostics/logging, because these two will be picked on every single time.

"Don't have it debug log 'Foo'd 5 bars', make it 'Foodulated five bar things'"


In Case of Emergency Break Glass

There really needs to be a procedure for violating procedure, so that when higher-ups say "do this immediately", those same higher-ups can own the consequences of violating procedure, and so that they know the business implications of the rollback plan, etc.

If the bug is "people are getting laid off" or "our customer database is being siphoned by an SQL injection", it probably should not wait.


The process sounds fine, no need to violate it. The problem was that the IT Admin said "I'll put this on the fast track" but didn't, and no one else saw the urgency. It was just dumped at the back of the queue behind a lot of other things and mistakenly labeled an 'Enhancement'.

Proper handling would have had a dev working on it within the hour. The response to the code review would have been "other changes are out of scope, open another ticket for enhancements". If the developer didn't already have access to the test environment, at least IT would know that they're authorized for it. (Critical fixes shouldn't be assigned to a new hire who doesn't know the code base and doesn't have access to commit code yet.) The ticket would be edited (Test Case: view WorkOrdersHours report, Expected Result: ~10% higher total) and tester would be told to reread it and go ahead. And of course documenting the test runs is the tester's responsibility (they're the one doing the test runs), not the developer's, as is notifying the IT Admin and Operations Manager when the testing is done and it's ready for sign off.

Exact same process, same quality checking along the way, timeframe reduced from 6 days to 4 hours or less; likely less than 1 if the team works together well and their tools are decent. A good process can handle emergencies without sacrificing quality, as long as the people work together and the priority is communicated.


I once waited 6 weeks for a backend team to fix a prematurely closed div tag that was screwing up the layout of a news site. The templates they were given were good; they screwed up the implementation. In desperation I had to hack the layout with JS (which I was able to update at will).


That kills me to hear that. I'm in charge of Web where I am and we make 95 percent of our changes instantly. There is no reason to make people wait; it honestly takes longer to project manage most changes than to just make them, so we just make them, but of course Web sites/front end is a different beast than application or other code.


I don't get this: whatever happened to a hot fix in minutes, and then a proper fix 6 days later? Have we forgotten how to do this? Like some other reasonable comments here, I can't fault the process in the story, but I just can't see why they didn't allow a hot fix.

Then it would be a different title: it takes 5 minutes to approve the change to 1 line of code and roll it into production, and then 6 days to get an official patch done in accordance with proper IT policy.


What makes you think they have CI and CD? It's entirely possible that just building and shipping is a full-day procedure.


I'm surprised no one has raised the problem of senior managers and VPs underestimating and oversimplifying the complexity and LOE of work. It causes unrealistic timelines for others who do have to adhere to a release process (which is never perfect but certainly necessary). "Yes CEO, of course we can get that in by tomorrow, no problem."

When the work misses the impossible deadline, the feedback usually points to an underperforming organization and rarely to unrealistic estimation.


Agree with some other comments here:

Good: Code review? Great. Boy Scout rule enforced in legacy code? Awesome. The QE analysis seems great!

Bad: Failure to expedite? Fail. And without that there's no story.

Also at a high level, the fact that it is only 1 line of code is irrelevant. It's not closely correlated to the riskiness of the change.


Favorite line:

"Ed: Fuck that shit."

"Shirley: That may very well be true."



I work at a consulting company. We have extremely aggressive deadlines and work with less-than-capable people on both sides of the table.

This means that sometimes it takes a typical programmer only a few minutes to change a line of code and promote it to UAT.

Forget the number of bugs that happen after this. Yeah, the number is VERY high, but that's subjective. Think about the overall state of the program. When all that matters is velocity, people take all kinds of shortcuts. OO goes out the window. Hard-coding? What about things that should've been parameters and are not only hard-coded, but spread across many files?

TL;DR: The OP's case seems contrived, or at least the other extreme. Processes should be followed. If the factory stopped because of that single line, or any other, someone would ask: all this to save 6 man-days?


This has been discussed on Hacker News before (https://news.ycombinator.com/item?id=3989136). The cited article was published in 2012 at a different domain that is no longer alive. Thanks to the Wayback Machine, we have copies. Here's one: https://web.archive.org/web/20120623032243/http://edweissman...


Change control needs to be structured around risk, i.e., how much money you're losing per second during an outage. Also, not all changes carry the same risk. Your change control process should be malleable to these realities. If a change is low risk and the reward is high, it shouldn't need the same controls as a high-risk change. Also, there are more ways to mitigate risk than adding steps to the release process.

TL;DR: Absolutists make poor release engineers.


This is why it's important to have a "hotfix" process where you can make a change fast and then still follow up with all the QA and coding standards.


The bad thing about bureaucracy and slow processes is that eventually you'll find yourself in a situation where you either 1) start working on something else while waiting for some process step to complete and your changes to get merged, and then gradually forget to play your turns in the process, delaying it even further, or 2) waste days sitting on the important change, pushing it forward, staying active and responsive, and basically avoiding any other non-trivial task because otherwise #1 happens.

So you need to decide whether you want to get real work done or to shepherd through the one task for which the coding took minutes or hours but the process eats up days.


To change the frame of reference a bit, how would this story work out at Google?

I just have an outsider perspective on this, but I really think that monorepos and the automation that comes with them will have an amplifying effect in the years to come.


Requiring all your constants to be configurable seems like a great way to bypass (potentially critical) code review and QA. I can totally see changing something like "number of months" causing extremely non-obvious issues (e.g. integer overflow in something 20 stack levels away). Once you make it a configurable constant, though, changing it won't be QA'd at all.
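
To make that concrete, here's a minimal, hypothetical sketch (Java; the parameter name, units, and numbers are my own invention, not from the article) of that failure mode: the reviewed value is fine, but an unreviewed edit to the parameter file silently overflows arithmetic far from where the value is read.

    import java.util.Properties;

    public class RetentionWindow {
        // Somewhere many stack frames away from where the config is read,
        // a utility converts the retention window to milliseconds as a 32-bit int.
        static int retentionMillis(int days) {
            // 24 * 3600 * 1000 = 86,400,000 ms per day; anything above 24 days
            // silently overflows Integer.MAX_VALUE and goes negative.
            return days * 24 * 3600 * 1000;
        }

        public static void main(String[] args) {
            Properties config = new Properties();
            // Originally shipped (and QA'd) with "7". Later someone edits the
            // parameter file to "30" -- no code review, no regression run.
            config.setProperty("retention.days", "30");

            int days = Integer.parseInt(config.getProperty("retention.days"));
            System.out.println(retentionMillis(days)); // prints a negative number
        }
    }

A value of 7 days works; the unreviewed bump to 30 wraps a 32-bit intermediate negative, which is exactly the "20 stack levels away" kind of breakage that a config-only change will never put in front of QA.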


If technical debt accumulates, you slow down until you need a lot of time just to understand the consequences of changing code.

Pretty much like the end game in Jenga.

One good approach is SOLID: https://en.wikipedia.org/wiki/SOLID_(object-oriented_design)


It's hilarious to read, but part of what leads up to this is compliance, audit trails, and security: investors and customers want to know that some dev isn't jumping into vim every 5 minutes and saving new code straight to production, taking things down with a syntax error.


The only difference between this and a large modern tech company is that the latter has this process automated. If anything, large tech companies have more respect for the need for process, as they understand how critical it is for everything to run smoothly.


It seems like buried in all the other policies is one meta-policy: nothing gets done until someone with sufficient authority overrides all the rules.

How often does the president have to do this? Has anything ever been done the correct way?


The reality is that in a large org, process can usually be sidestepped with an emergency change request and an executive sponsor. Which is basically what happened here; it just took extra days because it wasn't started that way.


Are there really factories whose policies and parameters are encoded in software vs. under the guidance of humans?

That in itself is amazing to me, considering that at the tech company I work at, lots of things are just done by fiat.


Since this was a critical issue, I think "David" should've brought key stakeholders (or leads) together and set expectations for who was doing which parts and when it would be complete.


There's a saying: "if it hurts, do it more often" (https://www.martinfowler.com/bliki/FrequencyReducesDifficult...)

These guys should do all their PRs as one-line (or few-line) changes, put them through the system, and improve the system until they can get things deployed within an hour (or some reasonable time period).


They should also limit the number of lines that are allowed to be changed/added/removed every 6 days... and maybe it should be a law for popular products.


Why did they hard-code a variable in the first place? Hundreds of comments about technical debt when it's just shitty engineering.


This sounds like bureaucratic nonsense waiting for signatures and approvals. Maybe the task flow needs to improve.


These processes are beyond fixing. If you're absorbed into a corporate monolith that values 'best practice', you'll just have to put up with it. However, this really would not happen at any shop that is concerned with saving money.


I don't think I would eat 24 Excedrin over something like this.


This sounds like perfect is the enemy of good.


That's a dream process right there, right?

The solution, IMO, is to tell the higher-ups to fast-track it if people's jobs are at stake.


This is true beauty


Become a $$$/hour contractor and learn to love the bureaucracy.

Then wait for those all-too-often moments where a manager's ass is on the line if feature X doesn't ship by tomorrow and watch all the rules and regulations get chucked wholesale out the window. Then chuckle.


Exactly. The mantra I was taught when I first began consulting, for when faced with frustrations like these, was (adapted from the old-school Hawaii Five-O): "Bill 'em, Danno!" I've been at it for 20+ years and it truly works.


Surely there could be some optimizations, but there are no gross failures here.

At every step there were reasonable policies, it seems.

It's not wise to frame this issue as '1 line of code' because '1 line of code' could ruin a company.

It's frustrating ... but it's not that bad.


Well, I've seen a 5-line patch take 6 months (and counting), so...


I've had 6 months of work thrown out because management changed its mind over which approach to take. So 0 lines in 6 months.


Yeah, it's sad. I was thinking, '6 days? That's pretty quick.'


TRWTF is that it is considered a problem that it took 6 days. Sounds like there was no harm done and 6 days was OK.


[2015]



