omazurov's comments | Hacker News

No mature, sensible code allows for 10x performance improvement (let alone...). Every time I see a statement like this I take it as a confession.


I mean, in this situation the big thing is taking a plugin and rewriting it as part of the core application. The initial plugin was inherently limited by the existing plugin API, which wasn't optimized for this specific use case.

"We improved a popular plugin by integrating it with our core application" doesn't feel like a confession to me. In fact, that's exactly what I'd like to see from an application developer. No application is perfect, and it's good when developers improve the basic look and feel of an application based on what their users want.


What if, after long research, you can change the complexity of an algorithm from O(N) to O(log N), which makes the code orders of magnitude faster on large datasets? Does that mean the original code was not "sensible code"?
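
To make that concrete, here is a toy sketch in Java (mine, not from the article) contrasting an O(N) linear scan with an O(log N) binary search over the same sorted array; on ~16M elements the gap is already several orders of magnitude:

    public class ComplexityDemo {
        // O(N): scan every element until a match is found.
        static int linearSearch(long[] sorted, long key) {
            for (int i = 0; i < sorted.length; i++) {
                if (sorted[i] == key) return i;
            }
            return -1;
        }

        // O(log N): halve the search interval on every step.
        static int binarySearch(long[] sorted, long key) {
            int lo = 0, hi = sorted.length - 1;
            while (lo <= hi) {
                int mid = (lo + hi) >>> 1;
                if (sorted[mid] < key) lo = mid + 1;
                else if (sorted[mid] > key) hi = mid - 1;
                else return mid;
            }
            return -1;
        }

        public static void main(String[] args) {
            long[] data = new long[1 << 24];                  // ~16M sorted elements
            for (int i = 0; i < data.length; i++) data[i] = 2L * i;
            long key = data[data.length - 7];

            // Rough single-shot timing; good enough to show the asymptotic gap.
            long t0 = System.nanoTime();
            linearSearch(data, key);
            long t1 = System.nanoTime();
            binarySearch(data, key);
            long t2 = System.nanoTime();
            System.out.printf("linear: %d us, binary: %d us%n",
                    (t1 - t0) / 1_000, (t2 - t1) / 1_000);
        }
    }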


I know reading the article is hard, but if you did you'd realize that it was slow because colorization was performed by an extension and could not be optimized sufficiently because of limitations of the public API. They sped it up by moving it into the VS Code core, which allowed it to take advantage of a bunch of features not available to extensions.


You didn't even bother reading the article, did you? The performance increase was achieved by inlining a third-party extension into VSCode itself.


The code was both mature and sensible _within the limits of the environment and API available to it_.

The VSCode team took it out of that environment completely, which opened up a huge number of optimization possibilities.


My personal recommendation for a first Tarkovsky film is his diploma work, The Steamroller and the Violin [0] (1960, 46 min, co-written with Andrei Konchalovsky). It was surprisingly watchable [for me], and I wish I had watched it first myself before I was exposed to The Mirror when I was 16.

[0] https://en.wikipedia.org/wiki/The_Steamroller_and_the_Violin



You get JFR's precision but inherit its blind spots as described in [1]:

> demo1: ... JFR will not report anything useful at all, since it cannot traverse stack traces when JVM is running System.arraycopy()

I'd rather run jstack in a loop than lose System.arraycopy() (or, in fact, any native code, be it the JVM's or JNI).

[1] https://github.com/apangin/java-profiling-presentation
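
For what it's worth, "jstack in a loop" can be approximated in-process with nothing but the JDK. The sketch below is my own stand-in, not a real profiler: it samples via Thread.getAllStackTraces(), so it trades JFR's blind spots for a safepoint bias of its own, but it keeps every frame the JVM reports:

    import java.util.HashMap;
    import java.util.Map;

    // Crude sampler: the in-process equivalent of running jstack in a loop
    // and counting top-of-stack frames. Run the workload in other threads
    // of the same JVM; this thread only samples.
    public class PoorMansSampler {
        public static void main(String[] args) throws InterruptedException {
            Map<String, Integer> hits = new HashMap<>();
            long deadline = System.currentTimeMillis() + 10_000;   // sample for 10 s
            while (System.currentTimeMillis() < deadline) {
                for (Map.Entry<Thread, StackTraceElement[]> e
                        : Thread.getAllStackTraces().entrySet()) {
                    StackTraceElement[] frames = e.getValue();
                    if (frames.length == 0 || e.getKey() == Thread.currentThread()) continue;
                    String top = frames[0].getClassName() + "." + frames[0].getMethodName();
                    hits.merge(top, 1, Integer::sum);
                }
                Thread.sleep(10);                                   // ~100 Hz
            }
            hits.entrySet().stream()
                .sorted((a, b) -> b.getValue() - a.getValue())
                .limit(20)
                .forEach(e -> System.out.println(e.getValue() + "\t" + e.getKey()));
        }
    }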


I would rather use Andrei Pangin's async-profiler than do something custom with jstack.


>Profilers are not rocket science...

They are not. There are orders of magnitude more rocket scientists than good profiler writers. And no, bottle rockets with flight recorders do not count.


The number of profiler writers is not too large (if you ignore all the people writing the UIs), but it's the same as with compilers: you only need a few to create a useful enough product. There are many more people working on tracing right now.

There are even fewer people working on the underlying profiling APIs. I'm one of the few working regularly on the stack-walking API that powers most application performance monitors (and async-profiler), and I'm currently the person who writes all the code that tests AsyncGetCallTrace. If anyone is interested, I wrote a blog post on it: https://mostlynerdless.de/blog/2023/03/14/validating-java-pr....

I'm writing blog posts on the profiling topic every two weeks (usually publishing them on Monday or Tuesday).


Such a task is usually thankless work, so: thank you. :)

I don't doubt that my favorite profiler, YourKit, also uses your code.

Profilers are probably the class of tool that has most drastically improved the code I write, so if anyone reading this hasn't really felt the need to use one: you should.

Just a cursory glance can quickly verify assumptions about runtime performance. Even if the code is currently acceptable, it helps to be mindful of where cycles are being spent (and of lock contention, network blocking, etc.), so if you end up doing some refactoring anyway, that additional context and knowledge can guide improvements that aren't strictly "performance work".
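
If you want the lowest-friction version of that cursory glance, JFR can even be driven from code. A minimal sketch using the standard jdk.jfr API (the workload method is a made-up stand-in for whatever you actually suspect):

    import jdk.jfr.Configuration;
    import jdk.jfr.Recording;
    import java.nio.file.Path;

    // Wrap a suspect code path in a JFR recording to check where time goes.
    public class QuickCheck {
        public static void main(String[] args) throws Exception {
            Configuration profile = Configuration.getConfiguration("profile");
            try (Recording rec = new Recording(profile)) {
                rec.start();
                long result = doWorkUnderSuspicion();   // the code whose cost you want to verify
                rec.stop();
                rec.dump(Path.of("quick-check.jfr"));   // open in JMC or a JFR viewer
                System.out.println(result);
            }
        }

        // Stand-in workload: some allocation plus CPU work.
        static long doWorkUnderSuspicion() {
            long sum = 0;
            for (int i = 0; i < 50_000_000; i++) {
                sum += Integer.toString(i).hashCode();
            }
            return sum;
        }
    }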


> Such a task is usually thankless work so, thank you. :)

It's just a game of whack-a-mole: finding new segmentation faults and other bugs is quite easy when you know how to write good test code; fixing the actual bugs, less so. But it's fun, and I do this for a living on the SapMachine team at SAP SE.

> I don't doubt my favorite profiler, YourKit also uses your code.

They surely profit from the work on improving the profiling APIs.

> Just a cursory glance can quickly verify assumptions about runtime performance.

Yes. But there aren't enough people educating others about profilers. So I started blogging and talking (https://youtu.be/Fglxqjcq4h0) on this topic, besides fixing all the bugs and implementing my own profile viewer based on the Firefox Profiler (https://plugins.jetbrains.com/plugin/20937-java-jfr-profiler).


That is awesome; I will be sure to check out your viewer.


"It is not rocket science" is so 1970, I'd say nowadays it is "It is not semiconductor eng."


What are your criteria for a "good profiler writer"?


I know one when I see one.


>...retrocausal models also open avenues of exploring a “time-symmetric” view of our universe, in which the laws of physics are the same regardless of whether time runs forward or backward.

... or sideways!

If you want a visual model that may help to build some intuition about that, I have one at [1]. Retrocausality is not explicitly spelled out in that model but "it's uh... uh... it's down there somewhere".

[1] https://github.com/OlegMazurov/Janus


Necessity is the mother of invention? (Or, as the Russian proverb has it, "the poor are crafty at invention"? :( )


> Larger and larger block sizes are important. LDPC probably is the more practical methodology today, though I admit that I'm ignorant about them. Still cool to see someone try to push Reed Solomon to such an absurdly huge size though.

Multi-dimensional RS codes are an easy way to get to an absurdly huge size for real (granted they stop being MDS). Long-term archiving is an obvious application. One-way communication, as in deep space, is another, though speed requirements are less demanding there. That someone tried to push the envelope for a one-dimensional RS code to those limits is a curiosity. Some of the techniques may be useful in other practical approaches. It's a pity the code representation had to depart from GF(2^n).
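
In case the construction isn't familiar: the two-dimensional version is a product code, where you lay the data out in a grid, RS-encode every row, then RS-encode every column of the result; higher dimensions repeat the same step. A sketch of the shape of it (rsEncode is a hypothetical placeholder for any systematic RS(n, k) encoder over GF(2^8), not a reference to the linked work):

    // 2D product-code sketch: K x K data bytes become an N x N encoded block.
    public class TwoDimensionalRS {
        static final int K = 223, N = 255;       // classic RS(255, 223) per dimension

        // Placeholder: a systematic encoder returning the k data bytes
        // followed by n - k parity bytes. Plug in a real RS implementation.
        static byte[] rsEncode(byte[] data, int n, int k) {
            throw new UnsupportedOperationException("real RS encoder goes here");
        }

        static byte[][] encode(byte[][] grid) {  // grid is K x K data bytes
            byte[][] rows = new byte[K][];
            for (int r = 0; r < K; r++) rows[r] = rsEncode(grid[r], N, K);

            byte[][] full = new byte[N][N];
            for (int c = 0; c < N; c++) {        // now encode each of the N columns
                byte[] col = new byte[K];
                for (int r = 0; r < K; r++) col[r] = rows[r][c];
                byte[] enc = rsEncode(col, N, K);
                for (int r = 0; r < N; r++) full[r][c] = enc[r];
            }
            return full;                         // K*K payload bytes out of N*N total
        }
    }

Because the per-dimension code is linear, every row and every column of the resulting block is a valid RS codeword, so erasures can be recovered along either dimension; the price is that the combined code is no longer MDS.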


> Multi-dimensional RS codes are an easy way to get to an absurdly huge size for real (granted they stop being MDS)

Yeah, that's the traditional way of doing things. CD-ROMs used multi-dimensional RS (the RS(28,24) was just one piece of the inner code, IIRC, which would fit inside another multidimensional RS code).

But since it's no longer MDS, the "search space" of non-MDS codes is quite large. So you have the question of "which code has a higher probability of successfully correcting errors/erasures?"

LDPC and Turbo codes apparently have better probabilities of fixing errors/erasures, at far lower computational cost, to the point where a fair number of applications today seem to prefer Turbo/LDPC codes over multidimensional codes built out of RS.


I'm not talking about CD-ROMs or immediate availability in any form. How would you encode a petabyte (for a starter) with LDPC/Turbo? Not available right away but accumulated over months with no predefined upper limit? Computational cost remains an important factor but other requirements take over.


> How would you encode a petabyte (for a starter) with LDPC/Turbo?

I'm fairly certain that the particular problem you're asking about is solved by "fountain LDPC codes", and can't be solved by RS with any methodology.

RS is limited to its block size (in this case, a 2^20-sized block), which can be "extended" by interleaving the codes (with various kinds of interleaving). But interleaving a petabyte of data is unreasonable for RS.

I admit I'm not deeply studied in practical codes; I mostly just have the college-textbook understanding of them.


I can only repeat my initial statement:

> Multi-dimensional RS codes are an easy way to get to an absurdly huge size for real.

"Multi-dimensional" may mean 3,4,..,10 dimensions. "Absurdly huge" means a petabyte and beyond. "For real" means practical recovery of swaths of lost or damaged data. "Fountain LDPC codes" are essentially one-dimensional. Extending them to the ability of recovering an annihilated data center would make them impractical. Making them multi-dimensional would make them inferior to RS (which is still MDS in every single dimension).


There is the Leopard codec, which implements a very similar approach in GF(2^n). AFAIR, it's a bit slower, but overall it looks more useful than FastECC.

(OTOH, since we store hashes and other info anyway, we may store the extra overflow bits required for recoding into GF(p) there. That said, FastECC in practice works like a slightly non-MDS code, but at least there are guarantees on how much extra data we need to store.)


> But I’m here to tell you they got it wrong, and everyone’s been getting it wrong ever since. Students come away underwhelmed and baffled, and go on to become the next generation of teachers who repeat this process.

Yeah, this is how recursion works. "If it ain't broke, don't fix it" maybe?


That's why there should be another parameter: you should split the key into N parts so that any M of them (M <= N) can open the lock. You can increase N by adding people you don't trust 100%, say to 8, but leave M at your comfortable level, 5. Even if those 3 conspire, they would still be 2 people short of being able to break the lock. You can do a 12:5 split, giving 5 parts to your spouse and spreading the remaining 7 among your relatives and friends. There will still be a single point of failure, though, if somebody steals all 5 parts from him/her. You can decide to decrease your spouse's allocation to only 4 parts, so that they would need to cooperate with at least one of the other trusted parties. The point is that there is enough room for designing a scheme that is both secure and reliable.
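
A minimal Shamir-style sketch of that 12:5 scheme, just to make the arithmetic concrete (the prime field, the names, and the share allocation below are my own choices, and it is not hardened for real key material):

    import java.math.BigInteger;
    import java.security.SecureRandom;
    import java.util.HashMap;
    import java.util.Map;

    // M-of-N secret sharing over a prime field: split() hands out N points of a
    // random degree-(M-1) polynomial whose constant term is the secret;
    // combine() recovers the constant term from any M points.
    public class ThresholdSketch {
        static final SecureRandom RND = new SecureRandom();
        static final BigInteger P = BigInteger.probablePrime(256, RND);

        static Map<Integer, BigInteger> split(BigInteger secret, int n, int m) {
            BigInteger[] coef = new BigInteger[m];
            coef[0] = secret.mod(P);
            for (int i = 1; i < m; i++) coef[i] = new BigInteger(255, RND).mod(P);
            Map<Integer, BigInteger> shares = new HashMap<>();
            for (int x = 1; x <= n; x++) {
                BigInteger y = BigInteger.ZERO, xp = BigInteger.ONE;
                for (int i = 0; i < m; i++) {
                    y = y.add(coef[i].multiply(xp)).mod(P);
                    xp = xp.multiply(BigInteger.valueOf(x)).mod(P);
                }
                shares.put(x, y);
            }
            return shares;
        }

        // Lagrange interpolation at x = 0.
        static BigInteger combine(Map<Integer, BigInteger> shares) {
            BigInteger secret = BigInteger.ZERO;
            for (var i : shares.entrySet()) {
                BigInteger num = BigInteger.ONE, den = BigInteger.ONE;
                for (var j : shares.entrySet()) {
                    if (i.getKey().equals(j.getKey())) continue;
                    num = num.multiply(BigInteger.valueOf(-j.getKey())).mod(P);
                    den = den.multiply(BigInteger.valueOf(i.getKey() - j.getKey())).mod(P);
                }
                secret = secret.add(i.getValue().multiply(num).multiply(den.modInverse(P))).mod(P);
            }
            return secret;
        }

        public static void main(String[] args) {
            BigInteger secret = new BigInteger(128, RND);
            Map<Integer, BigInteger> all = split(secret, 12, 5);   // 12 shares, any 5 unlock

            // Shares 1..4 go to the spouse; 5..12 to relatives and friends.
            Map<Integer, BigInteger> spousePlusOne = new HashMap<>();
            for (int x = 1; x <= 4; x++) spousePlusOne.put(x, all.get(x));
            spousePlusOne.put(9, all.get(9));                      // one more trusted party

            System.out.println(combine(spousePlusOne).equals(secret));  // true
        }
    }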


>> helped kindle the European Renaissance as we know it.

> The italian renaissance started in the 14th century for sure and maybe even in 13th century depending on who you ask. Though the exact date can be disputed, nobody disputes the fact that it started long before the fall of the byzantine empire in the mid 15th century.

The early "Renaissance" was Latin, the Renaissance proper was Greek. That's all you need to know to figure out when (1397+), where (Florence+), and how it started. The seed is more important than the soil.

