How about we put it this way: all forms of dynamic memory management have overhead, including malloc(). The overhead for garbage collection is different from the overhead for malloc(); GC is worse in some respects (latency, space usage) but better in others (throughput, development time).
GC can be much faster than malloc() when allocating objects. Depending on the GC scheme used and the heap profile, the allocation savings may outweigh the cost of collection.
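To make the allocation point concrete, here is a toy model in Python (not a real allocator, just a step counter). It assumes a copying/generational-style collector that allocates by bumping a pointer, and compares it with a deliberately naive first-fit free list. The names BumpArena and FreeListHeap are made up for illustration, and real malloc() implementations use size-class bins and are far better than this first-fit walk, so treat the gap as a sketch of the mechanism, not a benchmark.

```python
class BumpArena:
    """Toy nursery: allocating is one bounds check plus one addition."""

    def __init__(self, size):
        self.size = size
        self.top = 0      # offset of the next free byte
        self.steps = 0    # number of successful allocations

    def alloc(self, n):
        if self.top + n > self.size:
            return None   # a real collector would trigger a collection here
        addr = self.top
        self.top += n
        self.steps += 1
        return addr

    def reset(self):
        # Stands in for a collection that reclaims the whole arena at once
        # (survivor copying is not modeled).
        self.top = 0


class FreeListHeap:
    """Toy first-fit free list: allocating walks blocks until one fits."""

    def __init__(self, blocks):
        self.free = list(blocks)  # (addr, size) pairs
        self.steps = 0            # number of free blocks inspected

    def alloc(self, n):
        for i, (addr, size) in enumerate(self.free):
            self.steps += 1
            if size >= n:
                if size == n:
                    del self.free[i]
                else:
                    # Carve the request off the front of the block.
                    self.free[i] = (addr + n, size - n)
                return addr
        return None
```

With a fragmented heap of 100 small 16-byte holes followed by one large block, a 32-byte request on the free list inspects every hole before it fits, while the arena does constant work per request. The cost the arena hides is the collection itself, which is where the heap profile comes in.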
I'd be wary of that paper: of the 5 garbage collectors they tested, only one appears to be generational. That makes me doubt they used sufficiently state-of-the-art garbage collection.
So "No GC" is a completely separate point.