The current default collector doesn't give memory back to the OS. So if you have several apps with peaky memory usage, you can't get them to elastically negotiate heap size tradeoffs with one another - you have to pack them in with manual max heap limits. That requires a lot of tuning, and it's still less than theoretically optimal.
We fork a child JVM to run our peakiest jobs for just this reason. It also helps keep services up when something OOMs.
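Roughly, the launcher looks like this (class name, heap cap, and flags are illustrative, not our actual setup):

    import java.io.IOException;

    public class PeakyLauncher {
        public static void main(String[] args) throws IOException, InterruptedException {
            // Spawn the peaky work in a throwaway child JVM with its own heap cap.
            Process child = new ProcessBuilder(
                    "java", "-Xmx30g",
                    "-cp", System.getProperty("java.class.path"),
                    "com.example.PeakyJob")
                .inheritIO()   // share stdout/stderr with the parent
                .start();
            int exit = child.waitFor();
            if (exit != 0) {
                // An OOM kill shows up here as a non-zero exit code; the
                // long-lived services in the parent process are unaffected.
                System.err.println("peaky job exited with code " + exit);
            }
        }
    }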
> The current default collector doesn't give memory back to the OS.
That's a pretty irrelevant point, as the current default collector in Sun's JVM does reduce the Java heap based on tuneable parameters. While it doesn't return the virtual address space to the OS, that generally doesn't impact memory consumption on the "current default" OSes. (Certainly there are specialized cases where you might care about that, and for those there are other collectors, and other JVMs for that matter.)
> So if you have several peaky memory usage apps, you can't try and get them to elastically negotiate heap size tradeoffs with one another - you need to pack them in with max heap limits manually.
That's simply not true. The default GC does adjust heap size based on utilization, so you absolutely can run peaky apps in a constrained memory space, provided they manage to negotiate different times for their peaks.
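The knobs I mean are along the lines of the free-ratio bounds (the percentages here are arbitrary, and the exact shrinking behaviour varies by collector and JDK version):

    java -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=30 -jar app.jar

MinHeapFreeRatio/MaxHeapFreeRatio bound how much free heap the collector keeps after a GC, which is what drives growing and shrinking the committed heap.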
> We fork a child JVM to run our peakiest jobs for just this reason.
Well, I guess that's one way to address the problem, but you've unfortunately misunderstood how your tool works.
> Well, I guess that's one way to address the problem, but you've unfortunately misunderstood how your tool works.
No, I don't think you have the context.
The peaky process will be killed for OOM by Linux; we explicitly don't want services to die, which they would if they lived in the same process. So, the services live in the parent process, and the peaky allocation happens in the child process. For context, at steady state the services consume about 2GB, whereas the peaky process may consume 30GB for 30 minutes or a couple of hours. We use resource-aware queuing / scheduling to limit the number of these processes running concurrently.
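The limiting itself is nothing fancy; a sketch of the idea (the permit count and names are made up for illustration):

    import java.util.concurrent.Semaphore;

    public class PeakySlots {
        // Cap how many 30GB child JVMs run at once; the real permit
        // count would be derived from the machine's available memory.
        private static final Semaphore SLOTS = new Semaphore(2);

        public static void runPeaky(Runnable launchChildJvm) throws InterruptedException {
            SLOTS.acquire();            // queue until a slot frees up
            try {
                launchChildJvm.run();   // fork the child JVM and wait for it
            } finally {
                SLOTS.release();
            }
        }
    }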
It's true that G1 will, under duress (e.g. in micro-benchmark scenarios with explicit calls to System.gc()), give up some heap to the OS, but that's not what you see in practice without exceptional attention paid to tuning. Process exit is a particularly efficient garbage collector, though.
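For concreteness, the micro-benchmark shape I mean is something like this (a sketch; totalMemory() reports the committed Java heap):

    public class ShrinkDemo {
        public static void main(String[] args) {
            // Run with e.g. -Xmx2g -XX:+UseG1GC so the heap can grow, then shrink.
            byte[][] chunks = new byte[512][];
            for (int i = 0; i < chunks.length; i++) {
                chunks[i] = new byte[1 << 20];  // commit ~512MB of heap
            }
            System.out.println("committed after alloc: " + Runtime.getRuntime().totalMemory());
            chunks = null;                      // make it all garbage
            System.gc();                        // explicit full GC; G1 may now uncommit
            System.out.println("committed after gc:    " + Runtime.getRuntime().totalMemory());
        }
    }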
The OOM killer kicks in when you run out of physical memory and swap, not virtual address space. If you genuinely have processes that only periodically actually need their heap to be large, but don't return unused memory to the OS, you can simply allow the OS to page out the address space that isn't currently used. There are subtle differences between returning address space to the OS and simply not using it, but they aren't the kind of differences that impact your problem.
G1's heap sizing logic is readily adjustable. The old defaults did rarely return memory to the OS, but you could tune them to suit your needs. Either way, this is no longer an accurate representation of G1's behaviour, as the runtime has adapted to changing execution contexts: https://bugs.openjdk.java.net/browse/JDK-8204089
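(For anyone following along: on JDK 12+ the periodic uncommit from that change is driven by flags along these lines; the interval value here is arbitrary and in milliseconds:

    java -XX:+UseG1GC -XX:G1PeriodicGCInterval=60000 -jar app.jar

When the JVM is sufficiently idle, G1 runs a periodic cycle that can return committed-but-unused heap to the OS.)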