There are a couple of issues here: scheduling, and managing the stack. The reason a runtime can do better than the kernel is that its constraints are (hopefully) less constraining than those of the kernel.
In the case of managing the stack, the kernel must be able to support languages like C/C++ and Rust, which can have interior pointers pointing into the stack itself. This makes it hard to reallocate stacks dynamically and move them around in memory. The JVM doesn't allow such pointers in application code, and the internal bookkeeping pointers it does use are known to the VM.
As to the scheduler, the kernel must support very different kinds of threads: threads that serve server transactions and block very often, and threads that, say, encode a video and rarely block at all. These different kinds of threads are best served by different schedulers (e.g. the "frequently blocking" kind of thread is best served by a work-stealing scheduler, but that may not be the best scheduler for other kinds of threads). When you implement fibers in the runtime, you can let the developer choose a scheduler for their needs.
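A minimal Java sketch of the "pluggable scheduler" idea. This uses only standard java.util.concurrent classes, not Loom's (still in-flux) fiber API: the point is simply that in userspace the developer can pick a work-stealing pool for blocking-prone tasks and a plain fixed pool for CPU-bound ones, side by side.

```java
import java.util.concurrent.*;

public class SchedulerChoice {
    public static void main(String[] args) throws Exception {
        // Work-stealing scheduler: idle workers steal queued tasks,
        // which suits workloads where tasks block or fork frequently.
        ExecutorService stealing = Executors.newWorkStealingPool();

        // Fixed pool with simple FIFO handoff: fine for long,
        // rarely-blocking CPU-bound work like video encoding.
        ExecutorService fixed = Executors.newFixedThreadPool(4);

        Future<Integer> a = stealing.submit(() -> 21 + 21);
        Future<Integer> b = fixed.submit(() -> 21 + 21);
        System.out.println(a.get() + " " + b.get());

        stealing.shutdown();
        fixed.shutdown();
    }
}
```

The kernel has to ship one scheduler policy for everyone; a runtime can hand the choice to the application, as above.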
The OS one has more _requirements_ (the number of "it must do..." items is much higher), which constrains its design space much more. I think you can call that "having more constraints", because it must meet more needs to even be viable as a solution.
The number of requirements for JVM threads is much lower, thus less constraining, giving the JVM more freedom to implement solutions tailored to its narrower feature set.
pron, isn't it simpler to just allocate everything on the heap than to reallocate coroutine stacks? Or is heap allocation significantly less performant than managing coroutine stacks (a stack needs to be copied, while a heap-allocated object can stay the same)?
Also, is there any timeline or estimate when Loom will be released with official JDK?
> Also, is there any timeline or estimate when Loom will be released with official JDK?
From another place in this thread:
As OpenJDK has switched to time-based releases, we no longer plan releases based on features. When a feature is ready, it is merged into the mainline and released in the following release. We never commit to a timeline, but I would say that the probability of fibers landing in one of the two releases next year is quite high. Of course, we release projects gradually, and it's possible that some planned fiber features will not land in the first release, or that the performance in the first release will later be improved, etc.
However, early access Loom binaries (with API still very much in flux) are expected in a month or so, with the intention of gathering feedback on the API. The decision on when fibers are ready to be merged will greatly depend on that feedback.
In the current Loom implementation, continuation stacks are allocated on the heap; they can grow and shrink as ArrayLists do -- the object stays the same, but the backing array needs to be reallocated.
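To illustrate the ArrayList analogy, here is a small sketch (not Loom's actual code) of a stack whose frames live in a heap array that is reallocated on growth: the holder object's identity stays the same, only the backing array is replaced.

```java
import java.util.Arrays;

// Sketch: a growable stack backed by a heap array, the way
// ArrayList grows. The GrowableStack object never moves; when
// capacity runs out, the backing array is reallocated and the
// contents are copied over.
public class GrowableStack {
    private long[] frames = new long[4]; // backing array on the heap
    private int top = 0;

    void push(long frame) {
        if (top == frames.length) {
            // Reallocate the backing array at double the capacity,
            // like ArrayList's internal grow step.
            frames = Arrays.copyOf(frames, frames.length * 2);
        }
        frames[top++] = frame;
    }

    long pop() {
        return frames[--top];
    }

    public static void main(String[] args) {
        GrowableStack s = new GrowableStack();
        for (long i = 0; i < 10; i++) {
            s.push(i); // pushing past 4 elements forces a reallocation
        }
        System.out.println(s.pop());
    }
}
```

Because the JVM knows every pointer into a continuation's stack, a copy like this is safe; a C stack full of interior pointers could not be moved this way.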