Identifying a representative usage scenario to optimize towards and then impleme...

bgirard · 2025-07-24T19:11:56 1753384316

> Identifying a representative usage scenario to optimize towards and then implementing that scenario in a microbenchmark test driver are both massively difficult to get right, and a "failure" in this regard, as the author found, can be hard to detect before you sink a lot of time into it.

I can vouch for this. I've been doing web performance for nearly 15 years and finding clear representative problems is one the hardest parts of my job. Once I understood this and worked on getting better at finding representative issues, it became the single thing that I did to boost my productivity the most.

jasonthorsness · 2025-07-24T19:18:37 1753384717

What have you found are the most useful approaches to collecting the representative issues in a way you can reuse and test? I haven’t worked as much in the web space.

bgirard · 2025-07-24T19:37:04 1753385824

A combination of: 1) RUM, 2) Field tracing, 3) Local tracing with throttling (CPU, Network), 4) Focusing either on known problems, or if I'm on less certain territory I will do more vetting to make sure I'm chasing something real. I'll minimize the time spent on issues that I can only catch in one trace, but sometimes it's also all you get so I'll time box that work carefully. It's more of an art than a science.

viraptor · 2025-07-24T19:17:59 1753384679

Tracing and continuous profiling made this task significantly easier than it was in the past, fortunately.

bgirard · 2025-07-24T19:32:43 1753385563

It's great. Much harder on Web because it's difficult to get rich information from the browser. This is why I contributed to the JS Self-Profiling API in the first place: https://developer.mozilla.org/en-US/docs/Web/API/JS_Self-Pro...

CodesInChaos · 2025-07-24T21:46:08 1753393568

Can your recommend tools for continuous profiling? I'm mainly interested in Rust and C# on Linux.

I'm not sure if it has gotten easier. Async is harder to profile than classic threaded code using blocking IO. And for database bound code, I'm not sure which databases output detailed enough performance information.

viraptor · 2025-07-24T21:54:19 1753394059

https://grafana.com/products/cloud/profiles-for-continuous-p... can be self hosted as https://grafana.com/oss/pyroscope/

And there's also https://docs.datadoghq.com/profiler/

sfink · 2025-07-24T20:22:05 1753388525

It's a subset of profile-guided optimization, but I fairly frequently will gather summary histograms from more or less representative cases and then use those to validate my assumptions. (And once in a while, write a benchmark that produces a similar distribution, but honestly it usually comes down to "I think this case is really rare" and checking whether it is indeed rare.)

anonymoushn · 2025-07-24T18:49:41 1753382981

I have asked for like the ability to apply a filter that maps real strings from real users to a very small number of character classes, so that I may run some codec on data that is equivalent to user data for the purposes of the codec. I have heard, this is not a good use of my time, and if I really care so much I should instead make various guesses about the production data distribution of user-created strings (that need to be json-escaped) and then we can deploy each of them and keep the best, if we can even tell the difference.

jasonthorsness · 2025-07-24T19:10:37 1753384237

If the real data is sensitive, it's hard to distill test data from it that succeeds in fully removing the sensitivity but that is still useful. Depending on the domain even median string length could be sensitive.

kazinator · 2025-07-24T20:56:38 1753390598

If they were forking that JVM in order to offer it as a development tool to a broad developer base, then such an optimization might be worth keeping.

Someone out there might have an application in which they are serializing large numbers of large integers.

The case for abandoning it is not as strong, since you don't know who is doing what.

It's a guessing game in programming language run-times and compilers: "will anyone need this improvement?" You have room to invent "yes" vibes. :)

grogers · 2025-07-24T22:41:36 1753396896

If the distribution of numbers you are serializing is uniformly random, you generally wouldn't choose a variable length encoding. Sometimes the choice is made for you of course, but not usually.

necovek · 2025-07-25T04:11:23 1753416683

The problem is that this is a complex assembly implementation that needs to be maintained over the course of decades in the future. Unless you have a good reason to keep it, you should drop it.

spott · 2025-07-25T01:48:23 1753408103

Why is just capturing real values not the answer here?

Something like grab .01% of the inputs to that function for a day or something like that (maybe a lower fraction over a week or month).

Is the cost of grabbing this that high?

adrianN · 2025-07-25T03:42:55 1753414975

That works for some cases, but if you have a service that processes customer data you probably don’t want to make copies without consulting with legal.