
You'll need to be more specific.

From a technical (i.e. useless) point of view, SBCL very nearly already has a real-time GC; with a few modifications it would qualify, since already:

1. The amount of work in a GC operation is bounded by the heap size

2. SBCL has a fixed maximum heap size

Two things would need to be done:

1. Ensure the areas where GC is inhibited are bounded

2. Call GC operations on a timer, and when they are done, ensure there is sufficient free space (a rough sketch follows below); RTGC cannot exist without specific bounds on the mutator, so you could almost certainly invent bounds on the mutator that would make the gencgc qualify as real-time.

#1 would need to be done for any actually useful RTGC anyways.
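For #2, this is roughly the shape it might take in SBCL. Treat it as a sketch only: SB-KERNEL:DYNAMIC-USAGE is an internal (not a supported API), and the interval and free-space threshold are invented.

    ;; Sketch: periodically force a collection and check that enough of the
    ;; (fixed-size) dynamic space is free afterwards. SB-KERNEL:DYNAMIC-USAGE
    ;; is internal, so this is illustrative rather than a recipe.
    (defun start-gc-timer (&key (interval 0.01) (min-free (* 64 1024 1024)))
      (sb-thread:make-thread
       (lambda ()
         (loop
           (sleep interval)
           (sb-ext:gc)
           (let ((free (- (sb-ext:dynamic-space-size)
                          (sb-kernel:dynamic-usage))))
             (when (< free min-free)
               ;; The mutator has out-consed the collector; a real system
               ;; would have to throttle allocation or fail here.
               (warn "GC could not keep up: ~D bytes free" free)))))
       :name "gc-timer"))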

A slightly less snarky answer is that a non-toy GC is a lot of work. Different choices in GC affect code generation (e.g. read and/or write barriers, and forwarding pointers for incremental GC[1]).

There's a reason the gencgc has been around as long as it has: it's "good enough" for a lot of people, and the work needed to replace it (with any non-stop-the-world GC anyways) is rather large. Even TFA is neither incremental nor concurrent, just parallel.

1: Stop-the-world GCs may also use forwarding pointers, but codegen doesn't need to know about them because they are fully resolved before the mutator continues.




> From a technical (i.e. useless) point of view SBCL very nearly already has a real-time GC

If your technical (theoretical) definitions are useless, you need to use different definitions. A practical definition of “real-time” GC is a minimum mutator utilization bound for any program given sufficient RAM, but the only GC I’m aware of that does this (despite the abundance of papers on GC with “real-time” in the title) is Klock’s regional collector[1]. Unfortunately, it seems to be impractically complex. [To be fair, I don’t think any implementation of malloc() in common use would satisfy this constraint either.]

[1] https://eschew.wordpress.com/2016/09/02/summarizing-gc/


The customary way to improve GC performance in Common Lisp is to avoid consing. Which, incidentally, is the customary way to write high performance C code too. Allocate your memory in a piece up front and arrange the data in a way that's friendly to your architecture and then do stuff. Nice thing about Common Lisp is that DISASSEMBLE makes it a fairly smooth process to check that the optimizer is actually doing what you want.

And the new SBCL release aids this approach by expanding the cases where the compiler will stack allocate.
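For example, the usual pattern looks something like this (a sketch with made-up names; whether a given MAKE-ARRAY is actually stack-allocated depends on the SBCL version and the declarations, which is exactly what DISASSEMBLE and the compiler notes let you check):

    ;; Preallocate the working storage once, then reuse it in the hot loop
    ;; so the inner function never conses.
    (defvar *buffer* (make-array 1024 :element-type 'double-float
                                       :initial-element 0d0))

    (defun scale-buffer (factor)
      (declare (type double-float factor)
               (optimize (speed 3) (safety 0)))
      (let ((buf *buffer*))
        (declare (type (simple-array double-float (1024)) buf))
        (dotimes (i 1024 buf)
          (setf (aref buf i) (* factor (aref buf i))))))

    ;; DYNAMIC-EXTENT asks the compiler to stack-allocate the temporary;
    ;; DISASSEMBLE (or the notes from (speed 3)) shows whether heap
    ;; allocation was actually avoided.
    (defun sum-squares (xs)
      (declare (type (simple-array double-float (*)) xs)
               (optimize (speed 3)))
      (let ((tmp (make-array (length xs) :element-type 'double-float)))
        (declare (dynamic-extent tmp))
        (map-into tmp (lambda (x) (* x x)) xs)
        (reduce #'+ tmp)))

    (disassemble #'scale-buffer)  ; check: no calls to an allocation routine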


> If your technical (theoretical) definitions are useless, you need to use different definitions...

The question of "what is a real-time GC" is academic because nobody actually wants a real-time GC. They want a real-time system. And some GC algorithms can be used to construct a real-time system. Indeed, as others have noted, SBCL can be used to construct a real-time system if you don't do any allocations.


As far as I know, there is a better definition that is commonly used.

A soft real-time system is one in which every instruction that the programmer writes has a known, well-defined maximum time bound in which it executes. This can be achieved in a GC'd system if the GC pause time also has a known, well-defined bound, and if it is well defined which instructions can or can't be paused by the GC for that (maximum) amount of time.

This constraint is important for building systems because it allows the programmer to guarantee that certain parts of their program will fit within a given time bound, such as fitting in the window for rendering a frame at 60 FPS, or per-packet processing at 1G PPS, etc.
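To make the arithmetic concrete (the numbers here are invented):

    ;; Invented numbers: worst-case frame work of 12 ms plus a documented
    ;; 1 ms maximum GC pause still fits the ~16.6 ms budget for 60 FPS.
    (let ((frame-budget-ms (/ 1000.0 60))   ; ~16.67
          (worst-frame-ms  12.0)
          (max-gc-pause-ms 1.0))
      (<= (+ worst-frame-ms max-gc-pause-ms) frame-budget-ms))  ; => T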

The commercial Java implementation JamaicaVM claims to have such a soft real-time guarantee, with maximum GC pause times on the order of microseconds. They have a paper about it in the ACM Digital Library [0]. I haven't read the paper myself or used their product in any way, I'm only presenting their claims.

Hard real-time systems require a fixed time per operation, which would allow one to guarantee that, say, a given loop executes some operation at a precise frequency, which may be critical for things like analog signal processing. This is typically much harder to achieve using a GC'd system.

[0] https://dl.acm.org/doi/10.1145/1288940.1288954


> A practical definition of “real-time” GC is a minimum mutator utilization bound for any program given sufficent RAM

I'm going to push back against that definition since:

1) RAM is usually bounded

2) A "never free" allocation strategy meets this requirement.

3) The "minimum mutator utilization bound" needs to be at a given time-slice anyways; e.g. "No more than 50% utilization in any 10ms window" rather than just "No more than 50% utilization." Throughput optimized GCs already give a very low utilization over a very long window; it's narrow windows that they struggle with.

I think a practical definition gives minimum bounds for:

M: fraction mutator utilization over a narrow time window T (where narrow depends on the problem domain; the orders of magnitude from 10us to 100ms are all reasonable depending on the domain)

O: heap size overhead (as a greater-than-unity multiple of live data)

And conditions them on peak allocation rate R and heap size S, each of which may have some reasonable maximum value.

If you pick non-reasonable values for R, T, and S then "obviously not realtime" GCs can be made to qualify. If you don't condition your GC on R, T, and S then actual shipping real-time systems using GC don't qualify (e.g. Metronome/Staccato based systems).
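For what it's worth, the M-over-T part can be made concrete: minimum mutator utilization is usually computed from a trace of GC pauses by sliding a window of length T over the run and taking the worst case. A sketch (names invented; it samples window start times rather than computing the exact minimum):

    ;; PAUSES is a list of (start . end) GC pause intervals in seconds,
    ;; over a run of total length END. Returns the worst-case fraction of
    ;; any window of length WINDOW left to the mutator.
    (defun minimum-mutator-utilization (pauses end window)
      (let ((worst 1.0))
        (loop for start from 0.0 to (- end window) by (/ window 10)
              for stop = (+ start window)
              for paused = (loop for (ps . pe) in pauses
                                 sum (max 0.0 (- (min pe stop) (max ps start))))
              do (setf worst (min worst (- 1.0 (/ paused window)))))
        worst))

    ;; e.g. one 5 ms pause in a 100 ms run, measured with a 10 ms window:
    ;; (minimum-mutator-utilization '((0.020 . 0.025)) 0.1 0.01)  => ~0.5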


>> [...] given sufficient RAM.

> A "never free" allocation strategy meets this requirement.

Yeah, I could have put that better. “Sufficient” probably needs to be defined as a reasonable function of the program’s working set—I’d have said “independent of the program” instead of “reasonable”, except that’d make any non-compacting malloc necessarily non-real-time, as the (Robson) bound[1] there has to depend not only on the working set but also on the ratio between the smallest and largest allocations. On the other hand, maybe forbidding non-compacting mallocs is actually reasonable, because that bound is impractically large anyway.

> The "minimum mutator utilization bound" needs to be at a given time-slice anyways; e.g. "No more than 50% utilization in any 10ms window" rather than just "No more than 50% utilization."

No, the “minimum” in the name refers to a minimum (proportion of time) over all windows. See link in GP and the thesis referenced there for a discussion of the dependency on the window size (which is a bit funny for all allocators).

[1] https://www.sqlite.org/malloc.html


> No, the “minimum” in the name refers to a minimum (proportion of time) over all windows. See link in GP and the thesis referenced there for a discussion of the dependency on the window size (which is a bit funny for all allocators).

That's just false; I can't think of a GC that can maintain more than 0% utilization in a time window the length of a compare-and-swap operation (which could be hundreds of cycles on some CPUs).


thanks



