I can't see it happening even in the far future, except maybe for some artificial microbenchmarks. Java programs use a lot of memory, and since everything is allocated on the heap, garbage collection is an issue, impacting real-world programs.
In terms of raw CPU speed, direct access to vector instructions can be a game changer in various applications. C/C++ compilers are also getting better each day.
> Java programs use a lot of memory, and since everything is allocated on the heap, garbage collection is an issue, impacting real-world programs.
That is a very good observation. Looking back, at some point in the past memory wasn't that much slower than the CPU (mostly because CPUs were not that fast then).
So throwing more memory at a problem seemed like a very good way to improve performance. Following long chains of pointers through some nested structure or a long linked list was OK. At some point CPU speed went through the roof and left memory access speeds behind.
So then caches became very important. Cache aware programming was a "thing".
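To make the pointer-chasing point concrete, here is a rough Java sketch. The class and method names (`LocalityDemo`, `sumArray`, `sumLinkedList`) are made up for illustration. Both methods compute the same sum, but one walks a contiguous `int[]` while the other follows node pointers through a `LinkedList` of boxed `Integer` objects. The timing in `main` is only a crude indication, not a proper benchmark; something like JMH would be needed for numbers you can trust.

```java
import java.util.LinkedList;
import java.util.List;

public class LocalityDemo {
    // Sum values stored in one contiguous primitive array (cache friendly).
    static long sumArray(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Sum the same values by walking list nodes; each element is a boxed
    // Integer reached through a pointer, so accesses may be scattered in memory.
    static long sumLinkedList(List<Integer> values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000_000; // arbitrary size chosen for illustration
        int[] array = new int[n];
        List<Integer> list = new LinkedList<>();
        for (int i = 0; i < n; i++) {
            array[i] = i;
            list.add(i);
        }

        long t0 = System.nanoTime();
        long arraySum = sumArray(array);
        long t1 = System.nanoTime();
        long listSum = sumLinkedList(list);
        long t2 = System.nanoTime();

        // Crude single-run timing; results vary with JIT warm-up and hardware.
        System.out.println("array sum = " + arraySum + ", ns = " + (t1 - t0));
        System.out.println("list sum  = " + listSum + ", ns = " + (t2 - t1));
    }
}
```

Both sums are identical; only the memory layout differs, which is the part the cache cares about.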
That, coupled with lots of virtualized/cloud machines everywhere that have limited memory, kind of turned that initial thinking on its head. A small memory footprint became a desirable trait, just like in the old MS-DOS & Turbo Pascal days.
Java sort of flourished and grew in that time period when "throw more RAM at it to get performance" was a very obvious thing to do. Now I think it is less obvious that this is the best way.
(And perhaps Java's current or future GC and JIT strategies will start to take into account caches and memory frugality better).
Now, that said, I am still amazed at how it can achieve such great performance given all the stuff it does behind the scenes. It is not faster than C but heck, it is still very fast.
Java still lets you make a giant array and operate on that. For high-speed numerics programming you do that in every language: FORTRAN, C, Java, etc. At that point it comes down to how much information you and the compiler can share: restrict-qualified pointers, hand-written assembler kernels, etc.
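As a small illustration of the "giant array" style in Java, here is a sketch with hypothetical names (`FlatArrayDemo`, `axpy`, `get`). It stores a matrix in a single row-major `double[]` instead of an array of row objects, and runs a simple `y += a * x` kernel over primitive arrays. Plain counted loops like this are the kind a JIT may be able to vectorize, though whether it actually does depends on the JVM and the hardware.

```java
public class FlatArrayDemo {
    // Compute y[i] += a * x[i] over two primitive arrays of equal length.
    static void axpy(double a, double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        for (int i = 0; i < x.length; i++) {
            y[i] += a * x[i];
        }
    }

    // Read element (row, col) of a matrix stored row-major in one flat array.
    static double get(double[] matrix, int cols, int row, int col) {
        return matrix[row * cols + col];
    }

    public static void main(String[] args) {
        int rows = 2, cols = 3;
        // One contiguous block holding a 2x3 matrix: [1 2 3; 4 5 6].
        double[] m = {1, 2, 3, 4, 5, 6};
        double[] delta = new double[rows * cols];
        java.util.Arrays.fill(delta, 1.0);

        axpy(2.0, delta, m); // adds 2.0 to every matrix element in place
        System.out.println(get(m, cols, 1, 2));
    }
}
```

The whole matrix lives in one allocation, so the garbage collector sees a single object and the kernel walks memory sequentially.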