Yes, probably. But given that non-deterministic output is the nature of the beast with LLMs, and we're (mostly) engineers here, calling any part of this mundane sounds more like fighting words than an observation.
Extremely pedantic, but is "non-deterministic" really the right language? The same input will always produce the same output, provided you haven't intentionally configured the system to sample non-deterministically. It seems like the right way to describe it is as a chaotic deterministic system: the same input always produces the same output, but small shifts in the input or weights can result in dramatic and difficult-to-predict changes in the output.
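To make the "chaotic" part concrete, here's a toy sketch (made-up logits, not a real model) of how greedy decoding can flip to a different token from a tiny perturbation:

```python
import numpy as np

# Toy illustration with hypothetical numbers, not an actual LLM: greedy
# decoding takes the argmax of the next-token logits, so a tiny shift near a
# decision boundary picks a different token, and generation diverges from there.
logits = np.array([2.1000, 2.1001, 0.5])         # hypothetical next-token logits
perturbed = logits + np.array([2e-4, 0.0, 0.0])  # tiny change to the input

print(np.argmax(logits))     # 1
print(np.argmax(perturbed))  # 0 -- a ~1e-4 shift selects a different token
```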
> The same input will always produce the same output
Not guaranteed, even with the same seed. If you don't perform all the operations in exactly the same order, even a simple float32 sum, batched differently, can produce a different final value. This depends on the load factor and how resources are allocated.
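Rough sketch of what I mean, assuming NumPy (the chunk count is arbitrary): the same float32 values summed with two different batchings usually don't agree bit-for-bit.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000).astype(np.float32)

# Same values, two batchings: one pass over the whole array vs. combining
# 7 partial sums. Rounding happens in a different order in each case.
whole = np.sum(x)
batched = sum(np.sum(chunk) for chunk in np.array_split(x, 7))

print(whole, batched, whole == batched)  # the comparison is typically False
```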
Yeah, the fact that floating-point arithmetic isn't associative is a real pain for producing deterministic outputs - especially when you're running massively parallel computations on GPUs (or multiple GPUs), which makes the order of operations even less predictable.
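Minimal sketch of the root cause with plain Python floats; the grouping a parallel reduction happens to use changes the rounded result:

```python
# IEEE-754 addition isn't associative: the same three numbers, grouped
# differently, round to different results.
print((0.1 + 0.2) + 0.3)                       # 0.6000000000000001
print(0.1 + (0.2 + 0.3))                       # 0.6
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False
```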