Presumably every exactly-once processing scenario needs you to squeeze things through a serial pipe at some point? Otherwise two messages with the same ID could come in and be processed in parallel.
Yes, but the scope of the blocking/serialisation can be narrow or wide, i.e. it can be per message ID (highly parallel, more state to persist, one entry per ID) or one for all messages of a certain type/partition (not parallel, less state required, a single last index for all messages of that kind).
If one-and-only-one semantics are needed and processing should be parallel, other methods have to be used.
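To make the narrow vs. wide scope concrete, here is a rough Python sketch of the two bookkeeping styles. The class and method names (PerIdDeduplicator, PerPartitionDeduplicator, try_claim) are made up for illustration, the state is kept in memory rather than persisted, and the per-partition variant assumes each partition delivers messages in order with increasing indices. A real system would also have to persist this state atomically with the effect of processing.

```python
import threading


class PerIdDeduplicator:
    """Narrow scope: one entry per message ID (hypothetical helper)."""

    def __init__(self):
        self._seen = set()
        self._lock = threading.Lock()  # serialises only the check-and-record step

    def try_claim(self, message_id):
        """Return True if the caller may process this ID, False if it is a duplicate."""
        with self._lock:
            if message_id in self._seen:
                return False
            self._seen.add(message_id)
            return True


class PerPartitionDeduplicator:
    """Wide scope: a single last-processed index per partition (hypothetical helper).

    Assumes messages within a partition arrive in order with increasing indices.
    """

    def __init__(self):
        self._last_index = {}
        self._lock = threading.Lock()

    def try_claim(self, partition, index):
        """Return True if index is newer than the last one recorded for the partition."""
        with self._lock:
            if index <= self._last_index.get(partition, -1):
                return False
            self._last_index[partition] = index
            return True
```

The per-ID version lets different IDs proceed independently but its state grows with the number of IDs seen; the per-partition version stores one integer per partition but only works if that partition is consumed serially.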