There are 3 reserved bytes in each of the handshake messages [0] which I imagine would be used for this purpose. There's also a full byte for message type, of which I think only 0x1 through 0x4 are currently defined.
While this is a possibility, it is still strange not to specify something like the protocol version explicitly. Even if an updated implementation used these fields for that purpose, it still couldn't communicate properly with an older implementation that doesn't understand their new semantics.
Also, looking at the struct, it seems the three bytes are reserved only to align the following fields on 4-byte boundaries.
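To make the alignment point concrete, here is a rough sketch using Python's ctypes. The field names are illustrative, not taken from any particular implementation, and the result assumes the platform's native alignment rules (a 32-bit integer aligned to 4 bytes, which holds on common platforms). It compares a header with three explicit reserved bytes against one that leaves the gap to the compiler:

```python
import ctypes


class WithReserved(ctypes.Structure):
    # One type byte, three explicit reserved bytes, then a 32-bit field.
    # Field names are illustrative only.
    _fields_ = [
        ("message_type", ctypes.c_uint8),
        ("reserved_zero", ctypes.c_uint8 * 3),
        ("sender_index", ctypes.c_uint32),
    ]


class WithoutReserved(ctypes.Structure):
    # Same layout without the reserved bytes; native alignment rules
    # are left to insert any padding before the 32-bit field.
    _fields_ = [
        ("message_type", ctypes.c_uint8),
        ("sender_index", ctypes.c_uint32),
    ]


def sender_offset(struct_type):
    """Return the byte offset of sender_index within struct_type."""
    return struct_type.sender_index.offset


if __name__ == "__main__":
    for t in (WithReserved, WithoutReserved):
        print(t.__name__, sender_offset(t), ctypes.sizeof(t))
```

Under those assumptions both layouts put the 32-bit field at the same offset, so the reserved bytes occupy space that would otherwise be implicit padding. Naming them explicitly fixes the wire layout regardless of compiler.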
If the authors had intended to allow for some kind of asynchronous update path, then surely it would have been built explicitly into the protocol right from the start.
Fortunately I know exactly what I was thinking. Each message has an explicit type. The type lives in the first byte, with the remaining three reserved for future additional use, perhaps for naming types. The cryptography is tied to the version by way of the identifier and construction constants.
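As a minimal sketch of the layout described here (one type byte followed by three reserved bytes), a receiver might split the header as below. The type values 1 through 4 follow the whitepaper. The function names, and the choice to flag nonzero reserved bytes, are assumptions made for illustration and not a description of what any real implementation does:

```python
import struct

# Message type values as listed in the WireGuard whitepaper.
MESSAGE_TYPES = {
    1: "handshake initiation",
    2: "handshake response",
    3: "cookie reply",
    4: "transport data",
}


def parse_header(packet: bytes):
    """Split the first four bytes into (type byte, three reserved bytes)."""
    if len(packet) < 4:
        raise ValueError("packet too short for a header")
    msg_type, reserved = struct.unpack_from("<B3s", packet, 0)
    return msg_type, reserved


def classify(packet: bytes) -> str:
    """Name the message type, or say why the header was not recognised."""
    msg_type, reserved = parse_header(packet)
    if msg_type not in MESSAGE_TYPES:
        return "unknown type"
    # Assumption of this sketch: reserved bytes must currently be zero.
    if reserved != b"\x00\x00\x00":
        return "reserved bytes set"
    return MESSAGE_TYPES[msg_type]
```

A future revision could assign meaning to the reserved bytes or define a new type value. A receiver written like this one would then fall into one of the two non-matching branches instead of misreading the message.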
[0] Page 10 onwards of https://www.wireguard.com/papers/wireguard.pdf