Our metric is approximately "hours of work for an expert engineer." Here are som...

jumploops · 2024-11-20T20:59:39 1732136379

Curious how these numbers correlate to the estimates of the engineers behind the PRs?

For example, the first PR is correlated with ~15 "hours of work for an expert engineer."

Looking at the PR, it was opened on Sept 18th and merged on Oct 2nd. That's two weeks, or 10 working days, later.

Between the initial code, the follow-up PR feedback, and merging with upstream (8 times), I would wager that this took the author longer than 15 hours of work.

It doesn't _really_ matter, as long as the metrics are proportional, but it may be better to refer to them as "isolated complexity hours," since context switching doesn't seem to be properly accounted for.
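The date arithmetic in the comment above can be checked with a short sketch. It assumes the PR dates fall in 2024 (the year of this thread; the comment gives no year), ignores public holidays, and uses a hypothetical helper, `working_days`, that is not part of any tool discussed here.

```python
from datetime import date, timedelta


def working_days(start: date, end: date) -> int:
    """Count Monday-Friday days in the half-open range [start, end)."""
    days = 0
    current = start
    while current < end:
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            days += 1
        current += timedelta(days=1)
    return days


# Assumption: the year is 2024; holidays are not considered.
opened = date(2024, 9, 18)
merged = date(2024, 10, 2)

elapsed_days = working_days(opened, merged)
estimated_hours = 15  # the "~15 hours" figure quoted in the comment

# Average metric hours per elapsed working day.
hours_per_day = estimated_hours / elapsed_days

print(f"{elapsed_days} working days, {hours_per_day} metric hours per day")
```

Under those assumptions, the estimate spreads to about 1.5 metric hours per elapsed working day, which is the gap the comment attributes to context switching and review overhead.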

adchurch · 2024-11-20T21:56:21 1732139781

Yeah, maybe "expert engineer" is the wrong framing and it should be "oracle engineer" instead. You're right that we're not accounting for context switching (which, to be fair, is not really productive, right?).

However, what ultimately matters isn't the absolute number but the relative difference (e.g., from PR to PR, or from team to team). That's why we show industry benchmarks and make it easy to compare across teams!

henning · 2024-11-20T20:33:10 1732134790

That assumes all or almost all of the work is writing the code, with no time allotted to actually using the app with the new code in place, benchmarking or taking other measurements, researching possible alternatives, etc.

adchurch · 2024-11-20T21:59:39 1732139979

Not at all! The algorithm is calibrated with real human effort. So find/replacing something 1000 times will have nowhere near the same value as adding 1000 lines of new code. And if you implement the same functionality in 100 lines instead of 1000, you'll get the same value.

What we don't capture is any product or communication overhead. However, our platform has other metrics that can help find out whether these are causing inefficiencies :)

henning · 2024-11-21T18:05:30 1732212330

In a complex, mature system, a high-impact bug could have a very small fix that is highly non-obvious. Your metric assumes that the person shitting out 1000 lines of a new feature no one wants is as productive as a distributed systems wizard who can fix bugs no one else can figure out and who adds a 3-line fix for an issue that customers have been complaining about for years. It is inherently biased towards adding new features and against maintenance and system quality improvement.