Hacker News

You really don't want to put >2GB in a single protobuf (or JSON object). That would imply that in order to extract any one bit of data in that 2GB, you have to parse the entire 2GB. If you have that much data, you want to break it up into smaller chunks and put them in a database or at least a RecordIO.
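The "break it up into smaller chunks" idea can be sketched generically with length-prefixed record framing (a hypothetical minimal format in the spirit of RecordIO, not its actual wire format): each record carries a length prefix, so a reader can step from record to record without parsing every byte as one giant message.

```python
import struct

def write_records(f, records):
    # Length-prefix each record (4-byte little-endian), RecordIO-style.
    for r in records:
        f.write(struct.pack("<I", len(r)))
        f.write(r)

def read_records(f):
    # Yield records one at a time; each record can be parsed (or skipped)
    # independently, so no single parse ever touches the whole file.
    while True:
        header = f.read(4)
        if len(header) < 4:
            return
        (n,) = struct.unpack("<I", header)
        yield f.read(n)
```

Each chunk would then be a reasonably sized protobuf (or JSON object) on its own, and an index or database can map keys to record offsets.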

Cap'n Proto is different, since it's zero-copy and random-access. You can in fact read one bit of data out of a large file in O(1) time by mmap()ing it and using the data structure in-place.
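The access pattern can be illustrated with plain mmap (a generic sketch with a made-up fixed-size record layout, not Cap'n Proto's actual API): the OS pages in only the bytes you touch, so reading one field of a huge file costs O(1) rather than a full parse.

```python
import mmap
import struct

RECORD_SIZE = 8  # assume fixed-size 8-byte records, purely for illustration

def read_record(path, index):
    # Map the file and read one record in place; only the page(s)
    # containing that record are faulted in, regardless of file size.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            offset = index * RECORD_SIZE
            (value,) = struct.unpack_from("<q", m, offset)
            return value
```

Cap'n Proto generalizes this by making its serialized layout identical to its in-memory layout, so arbitrary (pointer-based, not just fixed-size) structures can be traversed in place the same way.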

Hence, it makes sense for Cap'n Proto to support much larger messages, but it never made sense for Protobuf to try.

Incidentally the 32-bit limitation on Protobuf is an implementation issue, not fundamental to the format. It's likely some Protobuf implementations do not have this limitation.

(Disclosure: I'm the author of Protobuf v2 and Cap'n Proto.)



