Yep, this is the best practical advice at the moment. Well, for list CRDTs. State CRDTs (like a counter) are small and fast, and kinda better than OT in every way.
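To make the counter case concrete, here's a minimal sketch of a grow-only counter (a G-Counter), the textbook state CRDT. The class and method names are made up for illustration; the point is only that the whole state is one integer per replica and merging is a per-slot max.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merge takes the per-slot max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> how many increments that replica has made

    def increment(self, n=1):
        # A replica only ever bumps its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Max is commutative, associative and idempotent, so replicas can
        # merge in any order, any number of times, and still converge.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)


a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
a.merge(b)  # merging again changes nothing
```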
List ("operation based") CRDTs and OT systems are "equivalent" in a very academic sense that nobody really talks about or understands. It's really not obvious unless you've been staring at this stuff for years, but the equivalence is there:
You can make a CRDT out of any OT system by just shipping the entire history of operations to each peer. List CRDTs essentially do that, with a whole lot of tricks to compress that data set and use it without needing to linearly scan.
And you can convert the other way too. You can add a "rename" operation into a list CRDT which assigns a new name to each element currently in the document. Before the rename operation document "hello" might have IDs [a4, b2, b3, b1, a5]. The rename operation changes the IDs to [c1, c2, c3, c4, c5]. When an operation happens, you specify the version and the ID at that version of the predecessor (e.g. c2). The insert happens there. Then you need a method to take the ID at one version and "transform" it to the ID of the same item at a different version. Do the rename operation implicitly after every change, and voilà! You now have OT semantics. "Insert after c1" means "Insert after position 1".
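Roughly, the rename trick looks like the sketch below. It's a toy under loud assumptions: `rename`, `insert_after` and `transform` are hypothetical helpers, the document is just a flat list of ID strings, and there's no handling of concurrent edits or deletes. It only shows how renamed IDs double as positions and how an ID is carried from one version to the next.

```python
def rename(ids, prefix):
    """Give every element a fresh sequential ID; return (new_ids, old_to_new)."""
    new_ids = [f"{prefix}{i}" for i in range(1, len(ids) + 1)]
    return new_ids, dict(zip(ids, new_ids))


def insert_after(ids, pred_id, new_id):
    """Insert new_id directly after pred_id (toy version, no concurrency)."""
    i = ids.index(pred_id) + 1
    return ids[:i] + [new_id] + ids[i:]


def transform(item_id, *mappings):
    """Carry an ID forward through a chain of rename mappings."""
    for mapping in mappings:
        item_id = mapping[item_id]
    return item_id


v0 = ["a4", "b2", "b3", "b1", "a5"]          # IDs for "hello"
v1, v0_to_v1 = rename(v0, "c")               # [c1, c2, c3, c4, c5]

# "Insert after c2" is now the same thing as "insert after position 2".
edited = insert_after(v1, "c2", "x1")
v2, v1_to_v2 = rename(edited, "d")           # implicit rename after the change

# The item once called b3 was c3 at version 1 and is d4 at version 2.
b3_now = transform("b3", v0_to_v1, v1_to_v2)
```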
OT systems have one big advantage which is that you don't have to ship the CRDT state to every peer. With a rename operation, we can add back the operational simplicity of OT systems into a CRDT. But the code is (and always will be) much more complicated. So I think OT makes sense for strictly server-client systems.
You can also have a hybrid server, which talks CRDT to full peers on the network but just does OT when talking to browser clients and things like that. We talked about this at our public braid meeting at the start of the week. The discussion about this stuff starts about 30 minutes in: https://braid.org/meeting-15
> OT systems have one big advantage which is that you don't have to ship the CRDT state to every peer... You can also have a hybrid server, which talks CRDT to full peers on the network but just does OT when talking to browser clients and things like that.
Could you clarify what you mean? Assuming your CRDT is defined in terms of "operations" that contain (at minimum) an identifier+sequence tuple, zero or more references to other operations, and a value (as they are in this article) then there's no reason why you couldn't just ship a batch of individual operations to other clients when something changes rather than the whole state, since each operation is defined in absolute terms.
In other words, if you start with [A4="a", B2="b", B3="c", B1="d", A5="e"] at site A, and it gets turned into [A4="a", B2="b", B4="f", B3="c", B1="d", A5="e"] following a change from B, you can ship something like B4="f"->B2 to C as long as C's CRDT has synced up to version vector A5|B3. (And if it hasn't synced up yet, and you're not using a transport with causal delivery guarantees, the change could be cached at C until its dependencies have arrived.)
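A minimal sketch of what I mean, using the example above. Everything here is hypothetical (the `Replica` class, the op dictionaries, the field names), it assumes each agent's ops arrive in sequence order so the version vector can be a simple per-agent maximum, and it skips tie-breaking between concurrent inserts at the same spot. It only illustrates shipping a single op and caching it until its dependencies have arrived.

```python
class Replica:
    """Hypothetical replica that buffers ops until their dependencies are met."""

    def __init__(self, items, seen):
        self.items = list(items)  # [(op_id, value)] in document order
        self.seen = dict(seen)    # version vector: agent -> highest seq applied
        self.pending = []         # ops still waiting on missing dependencies

    def _ready(self, op):
        return all(self.seen.get(agent, 0) >= seq
                   for agent, seq in op["deps"].items())

    def _apply(self, op):
        # Place the new item directly after its predecessor.
        # Assumption: no tie-breaking between concurrent inserts.
        ids = [op_id for op_id, _ in self.items]
        self.items.insert(ids.index(op["after"]) + 1, (op["id"], op["value"]))
        agent, seq = op["id"]
        self.seen[agent] = max(self.seen.get(agent, 0), seq)

    def receive(self, op):
        self.pending.append(op)
        progressed = True
        while progressed:  # keep draining until nothing new becomes ready
            progressed = False
            for waiting in list(self.pending):
                if self._ready(waiting):
                    self._apply(waiting)
                    self.pending.remove(waiting)
                    progressed = True

    def text(self):
        return "".join(value for _, value in self.items)


# Site C has everything except B3 and B4.
c = Replica([(("A", 4), "a"), (("B", 2), "b"), (("B", 1), "d"), (("A", 5), "e")],
            seen={"A": 5, "B": 2})
b4 = {"id": ("B", 4), "value": "f", "after": ("B", 2), "deps": {"A": 5, "B": 3}}
b3 = {"id": ("B", 3), "value": "c", "after": ("B", 2), "deps": {"A": 5, "B": 2}}

c.receive(b4)               # arrives early, so it is cached
text_while_waiting = c.text()
c.receive(b3)               # applies B3, which then unblocks B4
```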
I don't think there's any need to transition to an OT system or to add renames in order to get this delta-shipping benefit: all the data you need is already there, unless I'm missing something. (But maybe you're describing something else?)
Yes, my point was that the peer needs to translate a user’s insert of “insert f at position 3” into “insert f between ID B2 and B3”. To do that, you need the “crdt chum” - you basically need that peer to know the ID of every item in the document. This data compresses well, but it’s still annoying to ship around and complex to manage. OT doesn’t need any of that.
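As a tiny illustration of that translation step (a sketch only: `position_to_ids` is a made-up helper, and I'm taking the index to be the 0-based gap where the new character lands, which is an assumption about how positions are counted):

```python
def position_to_ids(ids, index):
    """Return the (left, right) neighbour IDs around a 0-based insert index.

    The peer can only answer this if it holds the ID of every item in the
    document, which is exactly the extra state OT clients don't need.
    """
    left = ids[index - 1] if index > 0 else None
    right = ids[index] if index < len(ids) else None
    return left, right


doc_ids = ["A4", "B2", "B3", "B1", "A5"]
neighbours = position_to_ids(doc_ids, 2)  # the gap between B2 and B3
```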
> And you can convert the other way too. You can add a "rename" operation into a list CRDT which assigns a new name to each element currently in the document.
Operations in a CRDT must be commutative for merge/update to be well-defined, so it's not immediately clear how a "rename" operation can be expected to work properly.