Uber open-sources tool to automatically clean up stale code

tjalfi · on June 21, 2020

Piranha, Uber's code cleanup tool, was previously discussed at https://news.ycombinator.com/item?id=23516823.

The paper Piranha: Reducing Feature Flag Debt at Uber is at https://github.com/uber/piranha/blob/master/report.pdf.

burtonator · on June 21, 2020

We just wrote one that does something similar for TypeScript, if anyone wants us to OSS it... The idea is that any stale code causes a HUGE amount of headache, and removing it can be a lifesaver.

mkr-plse · on June 22, 2020

Piranha author here -- Would you be willing to contribute it to Piranha?

tomkinson · on June 22, 2020

@burtonator yes curious what you have or maybe it's a better fit under Piranha?

oars · on June 22, 2020

What does OSS stand for? I know the OS stands for Open Source, but can't quite figure out how this would be used as a verb...

jdxcode · on June 22, 2020

it stands for "open-source software". It's never abbreviated as "OS" for obvious reasons.

oars · on June 22, 2020

Thanks, I guess I'll add "Open Source Software" to my list of unusual yet widely understood verbs.

The English language confuses a native German speaker once again!

evv · on June 22, 2020

To be fair, I am a native English speaker and this is the first time I've seen it used as a verb. "OSS" generally acts as a noun. In this instance I would have used "open source" as the verb for what you could do to your proprietary software, after which it would become OSS.

jdxcode · on June 22, 2020

Other commenters saying it isn't used as a verb are wrong. They're correct that people never say "I open-source softwared my last project."

However you can definitely use "open-source" as a verb: "I open-sourced my last project." It's pretty common. And in this case "OSS" stands for "open-source", not "open-source software".

What's confusing you is that when we use the abbreviation "OSS" we really intend people to read it as "open-source", even though the abbreviation technically means "open-source software". Btw, you would never verbally spell out "OSS", chiefly because it has the same number of syllables as "open-source".

playpause · on June 22, 2020

I wouldn’t add that to your list of widely understood verbs based on a single HN comment. Any noun can be (mis)used as a verb in modern English, if speaking very informally, and often with a slight tongue-in-cheek humour about the clunky incorrectness of it. Some examples eventually become mainstream (to google something, to text someone, to roadmap it).

sushid · on June 22, 2020

I wouldn't call it unusual although I hear FOSS more often.

ygra · on June 22, 2020

As a noun I'd agree on it not being unusual, but as a verb it's weird.

stuckinaloop · on June 21, 2020

If you were to OSS it I would definitely use it!

zacksinclair · on June 22, 2020

I'd run something like that over my TS repos for sure!

tough · on June 22, 2020

Can't wait to give it a try

Coxa · on June 22, 2020

southpolesteve · on June 22, 2020

Please do!

squillful · on June 22, 2020

A similar tool, Vulture, exists for Python: https://github.com/jendrikseipp/vulture

I haven't used it yet myself, but discovered it during a search inspired by this post and thought it would be worth sharing. Definitely a trickier problem to solve for dynamic languages, but looks useful.

erik_seaberg · on June 21, 2020

I'm surprised to see that Boolean feature flags are common. We almost always ramp up percentages of UUID space or randomly chosen requests, to reduce the blast radius of a bad change.

cdcarter · on June 21, 2020

At $JOB, we have feature flags of innumerable shapes and sizes. Some are based on account standing, some are a gradual % rollout at random, others are a more thoughtful low-risk to high-risk rollout across customers and hosts. Some are manually flipped per customer or only on certain dev hosts. Literally anything you can think of, we have tied behavior to it.

But, we've got good frameworks in place such that at the call sites where behavior diverges, it's just checking a boolean.

if PermissionController.get().get(MyPerm.class): doA() else: doB().

I suspect this is pretty common, and it's still easy to do the dead code elimination on.
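A minimal sketch of that shape, in Python for brevity. Everything here is hypothetical (`Context`, the rule helpers, and this `PermissionController` are illustrations, not the framework described above): the rules can take any form, but the call site only ever sees a boolean.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Context:
    """Hypothetical request context; every field name is an assumption."""
    customer_id: str
    host: str
    account_in_good_standing: bool


# A rule can be arbitrarily complicated, but it always reduces to a boolean.
Rule = Callable[[Context], bool]


def good_standing(ctx: Context) -> bool:
    """Flag tied to account standing."""
    return ctx.account_in_good_standing


def only_hosts(*hosts: str) -> Rule:
    """Flag enabled only on certain (for example dev) hosts."""
    allowed = frozenset(hosts)
    return lambda ctx: ctx.host in allowed


def only_customers(*customer_ids: str) -> Rule:
    """Flag flipped manually per customer."""
    allowed = frozenset(customer_ids)
    return lambda ctx: ctx.customer_id in allowed


class PermissionController:
    def __init__(self, rules: Dict[str, Rule]) -> None:
        self._rules = rules

    def get(self, flag: str, ctx: Context) -> bool:
        rule = self._rules.get(flag)
        # Unknown flags default to off.
        return rule(ctx) if rule is not None else False


def handle(controller: PermissionController, ctx: Context) -> str:
    # The call site stays a plain boolean branch, which is what keeps
    # dead code elimination simple once the flag is retired.
    if controller.get("my_perm", ctx):
        return "A"
    return "B"
```

Because the branch is a single boolean check on a named flag, a cleanup tool only has to recognize one call pattern, however the rule behind it is computed.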

fennecfoxen · on June 21, 2020

> if PermissionController.get().get(MyPerm.class): doA() else: doB().

oh, but if a behavior changes when a feature flag is active, there's a very strong case for it to be pluggable behavior strategy, so I like these so much better as an unconditional call to `self.getThingStrategy().execute()`

anamexis · on June 21, 2020

Won't `getThingStrategy()` do something like `if PermissionController.get().get(MyPerm.class): doA() else: doB()` ?

rlayton2 · on June 21, 2020

Yes, but you move the logic for deciding which branch to take into an underlying function. It pollutes the parent function less, but results in more functions overall, and so more places for logic to hide.

Also, if the logic gets more complicated than just an if statement (if Permission... and date < cutoffdate: etc) you don't further pollute the parent function.

fennecfoxen · on June 21, 2020

Naah, but at least that would make the decision simple and clear (if condition return doA else return doB).

Better would be for whatever ThingFactory or getThingService instantiates the Thing to make the decision, and compute it up front.

If-else statements in application logic tangle the concepts of "what should be done and why?" with "let me do this Way 1" and "let me do this Way 2". Ideally a typical service (or model or similar) shouldn't be aware that "feature flags" as a concept exist, and this should be regarded as inimical to their encapsulation. It should just know that it delegates a decision to Way N.
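A rough sketch of what I mean, in Python. The names (`ThingService`, `make_thing_service`, `WayOne`, `WayTwo`, the `flag_enabled` callable) are made up for illustration; the point is only that the flag is consulted once, at construction time, and the service never sees it.

```python
from typing import Callable, Protocol


class ThingStrategy(Protocol):
    def execute(self) -> str: ...


class WayOne:
    def execute(self) -> str:
        return "way 1"


class WayTwo:
    def execute(self) -> str:
        return "way 2"


class ThingService:
    """Knows nothing about feature flags; it only delegates."""

    def __init__(self, strategy: ThingStrategy) -> None:
        self._strategy = strategy

    def run(self) -> str:
        return self._strategy.execute()


def make_thing_service(flag_enabled: Callable[[str], bool]) -> ThingService:
    # The only place that knows a flag exists. The decision is computed
    # up front, when the service is instantiated.
    strategy: ThingStrategy = WayTwo() if flag_enabled("my_perm") else WayOne()
    return ThingService(strategy)
```

Retiring the flag then means editing the factory and deleting one strategy class, without touching `ThingService`.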

vanviegen · on June 22, 2020

That sounds a lot like enterprisy Java programming. I'm not always thrilled to work with the results of that. :-)

novok · on June 21, 2020

Many feature flag systems ramp up doing that from the config server side. So when your client requests feature flags, it gets assigned true/false on that basis.

solidasparagus · on June 21, 2020

How do you test without a feature flag?

erik_seaberg · on June 21, 2020

The flag becomes a percentage. The feature isn't always off or always on, it's on for a small fraction of workload, and then that fraction grows as you demonstrate it's safe. Ideally you want metrics that tell you whether the experiment is worse or better than the control group.
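As a hedged sketch in Python (the function names and the choice of SHA-256 are assumptions, not a description of any particular system): hash the flag name together with the UUID into a fixed number of buckets, and enable the feature for the buckets below the current percentage.

```python
import hashlib
import uuid


def bucket(flag: str, unit_id: uuid.UUID, buckets: int = 10_000) -> int:
    """Map (flag, id) to a stable bucket in [0, buckets)."""
    # Including the flag name keeps different flags from always
    # selecting the same slice of IDs.
    digest = hashlib.sha256(f"{flag}:{unit_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def is_enabled(flag: str, unit_id: uuid.UUID, percent: float) -> bool:
    """True for roughly `percent` percent of IDs, in 0.01% steps."""
    return bucket(flag, unit_id) < percent * 100
```

An ID that is enabled at 5% stays enabled at 10%, so raising the percentage only ever adds workload to the experiment group.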

wahlrus · on June 21, 2020

What you're describing is a common way of using feature flags, except the percentage part comes from how you manage the servers running the binary with config. I.e. on day one, 5% of servers in the cluster get True for the flag value. Then double the percentage every day until 100%, or otherwise roll back if it's a bad cut.

Twirrim · on June 22, 2020

Then rolling forwards and backwards is a whole deployment away, or mucking about with infrastructure, vs tweaking a percentage flag somewhere.

If you want to get fancy with changes (and I've seen it done) you have something else capable of controlling that percentage setting that is tied in to your monitoring. Start out low, say 1% of requests hitting the new path. Automatically ramp up over time to full 100%. If you see failures, automatically drop back to 0% until it can be ascertained that the failure didn't come from the new code.
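Roughly like this, as a Python sketch. The class name, the doubling step and the 1% error threshold are all assumptions for illustration; in practice `error_rate` would come from whatever your monitoring reports for the new path.

```python
from dataclasses import dataclass


@dataclass
class RampController:
    """Hypothetical controller that owns the rollout percentage."""
    percent: float = 1.0          # start with 1% of requests on the new path
    step_factor: float = 2.0      # how aggressively to ramp per interval
    max_error_rate: float = 0.01  # above this, stop trusting the new path
    halted: bool = False

    def observe(self, error_rate: float) -> float:
        """Feed in one monitoring interval; return the new percentage."""
        if self.halted:
            # Stay at 0% until a human clears the halt.
            return self.percent
        if error_rate > self.max_error_rate:
            self.percent = 0.0
            self.halted = True
        else:
            self.percent = min(100.0, self.percent * self.step_factor)
        return self.percent
```

With healthy intervals the percentage goes 2, 4, 8, 16, 32, 64, 100; a single bad interval drops it to 0 and keeps it there until someone investigates.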

erik_seaberg · on June 22, 2020

Partitioning by instance works if you have enough instances to avoid big increases, but at that point you can just deploy known-good and new-feature builds. Runtime checking helps if it's a lot faster than rolling back to the known-good build, or if you're doing concurrent experiments (you may not have enough instances to try every possible combination).

RSZC · on June 22, 2020

Having done it both ways this would not be my recommendation unless it's necessary - I think it adds a fair amount of complexity.

Some considerations: you'll need some sort of storage mechanism for these flags - is that a centralized configuration service for all your services? Maybe just a table in your database? But database / network calls are expensive to add every single time your code executes the path in question - maybe it makes sense for your service to cache these values locally... but then doesn't that lose part of the purpose of 'fast rollbacks'? Maybe instead of a local cache you spin up a Redis instance - but what if this goes down? Will all your instances default to the same value? Etc, etc, etc.
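To make a couple of those trade-offs concrete, here is a hedged Python sketch of the local-cache variant. `CachedFlags` and its parameters are hypothetical, and `fetch` stands in for whatever call reaches your config service or database.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CachedFlags:
    """Local cache in front of a (hypothetical) remote flag store."""

    def __init__(
        self,
        fetch: Callable[[str], bool],
        ttl_seconds: float = 30.0,
        defaults: Optional[Dict[str, bool]] = None,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._defaults = dict(defaults or {})
        self._clock = clock
        self._cache: Dict[str, Tuple[bool, float]] = {}

    def get(self, name: str) -> bool:
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and now - cached[1] < self._ttl:
            # Fast path, but a rollback takes up to ttl_seconds to land.
            return cached[0]
        try:
            value = self._fetch(name)
        except Exception:
            # Store unreachable: prefer the last known value, otherwise a
            # fixed default so every instance degrades the same way.
            if cached is not None:
                return cached[0]
            return self._defaults.get(name, False)
        self._cache[name] = (value, now)
        return value
```

The TTL is exactly the tension described above: a longer TTL means fewer network calls but slower rollbacks.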

I'm not saying this approach is bad, only that it has complexity, and I find I generally can get away without it.

solidasparagus · on June 21, 2020

But how do you test the feature without a boolean flag that you can set to enable the feature?

hoorayimhelping · on June 21, 2020

I think it might be less confusing to say, how do you verify the feature? As in: how do engineers and product managers and designers know that the flagged behavior is correct and how do you verify that in production if all you do is ramp up? How do you make sure the interested parties are always bucketed into the on experiment?

foota · on June 21, 2020

You mean unit testing? You can add a way in your framework to force the flag on.
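Something along these lines, sketched in Python. The `Flags` class and its `forced` helper are hypothetical, not any specific framework's API.

```python
from contextlib import contextmanager
from typing import Dict, Iterator


class Flags:
    """Hypothetical flag lookup with a test-only override hook."""

    def __init__(self, values: Dict[str, bool]) -> None:
        self._values = dict(values)
        self._overrides: Dict[str, bool] = {}

    def is_enabled(self, name: str) -> bool:
        # Overrides win over whatever the rollout logic would decide.
        if name in self._overrides:
            return self._overrides[name]
        return self._values.get(name, False)

    @contextmanager
    def forced(self, name: str, value: bool) -> Iterator[None]:
        """Force a flag for the duration of a test, then restore it."""
        missing = object()
        previous = self._overrides.get(name, missing)
        self._overrides[name] = value
        try:
            yield
        finally:
            if previous is missing:
                self._overrides.pop(name, None)
            else:
                self._overrides[name] = previous
```

A test can then wrap the code under test in `with flags.forced("new_ui", True):` and exercise both branches regardless of the current rollout percentage.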

Thaxll · on June 21, 2020

Then this is not a feature flag, it's A/B testing.

therealdrag0 · on June 22, 2020

The latest place I work has a dusty monolith full of skeletons and zombies. They’ve been migrating off it to micro services for like 5 years now.

It’d be nice to have a tool to clean it up. But the use case is slightly different from this one.

mkr-plse · on June 22, 2020

Can you elaborate further? It will be interesting to see other use cases for code cleanup.

bartkappenburg · on June 22, 2020

I like the concept of using an analyzer like this, but I've been looking for a tool that removes code that isn't stale per se but hasn't been executed for, say, x amount of time.

We have quite a few online services in production which I'm sure have a lot of code that isn't executed/touched by our users.

As an example, I'm thinking about code in if statements that is never reached because the condition always evaluates to False. The tools discussed here aren't capable of detecting such stale code.

Any suggestions?

mkr-plse · on June 22, 2020

Do you already know the code that is not executed? If not, dynamic program analyses can help identify the regions of code that are untouched (e.g., take a look at javaagent for Java programs). Subsequently, you can use some form of reachability analysis to determine which code blocks to delete without causing a compilation failure.
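As a very rough illustration of the dynamic-analysis half, here is a Python sketch that swaps javaagent for the standard `sys.setprofile` hook. The helper names are made up, matching by function name is a simplifying assumption (two functions with the same name would be confused), and a real deployment would sample production traffic over time rather than a single workload.

```python
import sys
from types import FunctionType
from typing import Callable, Dict, Set


def record_called_functions(workload: Callable[[], None]) -> Set[str]:
    """Run `workload` and collect the names of Python functions it calls."""
    called: Set[str] = set()

    def profiler(frame, event, arg):
        # "call" fires when a Python-level function is entered.
        if event == "call":
            called.add(frame.f_code.co_name)

    previous = sys.getprofile()
    sys.setprofile(profiler)
    try:
        workload()
    finally:
        sys.setprofile(previous)
    return called


def never_called(namespace: Dict[str, object], called: Set[str]) -> Set[str]:
    """Names of functions in `namespace` that the workload never entered."""
    return {
        name
        for name, obj in namespace.items()
        if isinstance(obj, FunctionType) and obj.__code__.co_name not in called
    }
```

The output is only a list of deletion candidates; the reachability step is still needed before anything is actually removed.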

creyes · on June 22, 2020

Is this drastically different than https://unused.codes/ ?

dahfizz · on June 22, 2020

From a quick reading, yes. Your tool identifies code that is never run. The featured tool identifies code that never _will be_ run if we make X change to the API.

creyes · on June 22, 2020

oh interesting, thanks for the clarification!

mkr-plse · on June 22, 2020

For Piranha: a) Code related to stale flags is deleted b) Determination of staleness is based on status of the feature. c) Patch is created automatically and in a majority of the cases, compiles and passes tests.

Based on my understanding of unused codes, a) unused codes is used to delete dead code independent of features b) determining the dead code is based on its usage in tests c) unclear whether the code is flagged for deletion or a patch is created.
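To illustrate point (a) for Piranha in the simplest possible terms, here is a toy Python sketch using the standard `ast` module (needs Python 3.9+ for `ast.unparse`). This is not Piranha's implementation; the `is_enabled("...")` call pattern and the helper names are assumptions made for the example.

```python
import ast


class StaleFlagRemover(ast.NodeTransformer):
    """Replace `if is_enabled("<flag>")` with the branch that survives."""

    def __init__(self, flag: str, treated: bool) -> None:
        self.flag = flag
        self.treated = treated  # True: keep the "on" branch; False: the "off" one

    def _is_flag_check(self, test: ast.expr) -> bool:
        return (
            isinstance(test, ast.Call)
            and isinstance(test.func, ast.Name)
            and test.func.id == "is_enabled"
            and len(test.args) == 1
            and isinstance(test.args[0], ast.Constant)
            and test.args[0].value == self.flag
        )

    def visit_If(self, node: ast.If):
        self.generic_visit(node)  # clean nested checks first
        if not self._is_flag_check(node.test):
            return node
        branch = node.body if self.treated else node.orelse
        # Returning a list splices the surviving statements in place of the
        # `if`; `pass` keeps the parent block valid if nothing survives.
        return branch or ast.Pass()


def remove_stale_flag(source: str, flag: str, treated: bool) -> str:
    tree = StaleFlagRemover(flag, treated).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)
```

Note that `ast.unparse` drops comments and original formatting, so a real tool that produces reviewable patches would need a source-preserving rewrite instead.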

W4ldi · on June 21, 2020

lol, I just thought about writing a script that does exactly that for JavaScript code. For easy implementation I thought about using annotations/comments that tell the script what to delete once a feature switch is being deleted.
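Something like this, sketched in Python. The `// flag-begin: NAME` and `// flag-end: NAME` marker format is made up for the example, as is the function name.

```python
import re
from typing import List

# Hypothetical marker comments wrapping code that dies with a flag.
BEGIN = re.compile(r"^\s*//\s*flag-begin:\s*(\S+)\s*$")
END = re.compile(r"^\s*//\s*flag-end:\s*(\S+)\s*$")


def strip_flag_blocks(source: str, flag: str) -> str:
    """Delete every marked region (markers included) for `flag`."""
    kept: List[str] = []
    skipping = False
    for line in source.splitlines():
        begin = BEGIN.match(line)
        end = END.match(line)
        if not skipping and begin and begin.group(1) == flag:
            skipping = True
            continue
        if skipping:
            if end and end.group(1) == flag:
                skipping = False
            continue
        kept.append(line)
    if skipping:
        raise ValueError(f"unterminated block for flag {flag!r}")
    return "\n".join(kept)
```

Regions marked for other flags are left untouched, so the script can be run once per retired switch.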

joering2 · on June 21, 2020

Total nitpick, but I think Piranha is a totally wrong name here. Piranhas mostly feed on fresh meat; it's known that they get excited and eager to attack when a victim moves in the water, reassuring the predator that it's healthy meat. A much better name would be Scavenger, or Vulture, a bird of prey that feeds on dead meat.

I assume you would rather clean dead code out of your program than strip it down to the bone of live functions :)

snthd · on June 21, 2020

https://en.wikipedia.org/wiki/Maggot_therapy might be a better analogy - you don't want to lose everything, just to clean out the necrotic bits.

detaro · on June 21, 2020

As far as I know that aspect of Piranhas is overhyped in popular culture, and they're pretty much omnivores, hunting fresh prey, scavenging on carcasses and eating plant matter. Agreed though that naming it after a pure scavenger would make more sense.

Kaze404 · on June 21, 2020

Wikipedia (and the sources listed) contradict this comment, saying Piranhas are vicious carnivores, with some even exhibiting cannibal behavior.

https://pt.m.wikipedia.org/wiki/Piranha

judge2020 · on June 21, 2020

English: https://en.wikipedia.org/wiki/Piranha

> Although generally described as highly predatory and primarily feeding on fish, piranha diets vary extensively, leading to their classification as omnivorous. In addition to fish (occasionally even their own species), documented food items for piranhas include other vertebrates (mammals, birds, reptiles), invertebrates (insects, crustaceans), fruits, seeds, leaves and detritus. The diet often shifts with age and size.

> In another study of more than 250 Serrasalmus rhombeus at Ji-Paraná (Machado) River, 75% to 81% (depending on season) of the stomach content was fish, but about 10% was fruits or seeds

throeawayjoe · on June 21, 2020

Piranhas are scavengers.

The typical diet of red-bellied piranhas includes insects, worms, crustaceans, and fish.[13] In packs up to hundreds, piranhas have been known to feed on animals as large as egrets or capybara. Despite the piranha's reputation as a dangerous carnivore, it is actually primarily a scavenger and forager, and will mainly eat plants and insects during the rainy season when food is abundant. ~ Wikipedia.

alextheparrot · on June 21, 2020

Could also change it to ¬Piranha

stevefan1999 · on June 22, 2020

/jokes

How does Piranha clean itself?

laydn · on June 21, 2020

I find it strange that a company which provides taxi hailing/routing technology felt the need to write a code cleanup tool.

dang · on June 22, 2020

"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."

https://news.ycombinator.com/newsguidelines.html

malux85 · on June 21, 2020

It's a company with billions in revenue and live updates that route around traffic; it has complex machine learning models predicting ride times and costs, and it operates in something like 65 countries.

Uber isn’t trivial.

zaphirplane · on June 21, 2020

I didn’t know they did their own navigation, I assumed they would use one of the map services

graeme · on June 21, 2020

Those services charge corporate users. If Uber develops nothing homegrown then it is subject to the whims of the major map services, mostly Google.

Most drivers use a 3rd party app in practice but Uber probably needs to run mapping to avoid a source of weakness/cost.

(Could be totally wrong about incentive structure)

sverhagen · on June 22, 2020

I thought that between Google, VLS, TomTom (and I can't imagine that I just coughed up a complete list), there'd be enough competition to just procure this function competitively. I assumed it was more the Not Invented Here syndrome of a VC-funded company that (until recently) had little cap on its engineering spend.

read_if_gay_ · on June 21, 2020

You can make most of the Big N sound silly like this. Amazon? Basically a warehouse. Netflix, YouTube? They just stream video. Facebook, Twitter? CRUD websites.

It all sounds like anyone can put something like that together, but try scaling up to billions of users.

jerzyt · on June 22, 2020

Exactly. And each one of these companies has built its business around a key idea. I interview a lot of candidates for data scientist roles, and one of my favorite questions is what makes one of these companies what it is.

speedgoose · on June 21, 2020

For sure, if you remove the scalability challenges and profit optimization techniques, these websites aren't that exciting for engineers.

argonix · on June 21, 2020

Uber's article [1] linked on the page mentions it at the start:

> These nonfunctional feature flags represent technical debt, making it difficult for developers to work on the codebase, and can bloat our apps, requiring unnecessary operations that impact performance for the end user and potentially impact overall app reliability.

> Removing this debt can be time-intensive for our engineers, preventing them from working on newer features.

[1] https://eng.uber.com/piranha/