> As an aside, that kABI guarantee only goes so far. I work in HPC/AI, and the out-of-tree drivers we use like MOFED and Lustre drivers would break with EVERY SINGLE RHEL minor update (like RHEL X.Y -> X.(Y+1) ). Using past form here because I haven't been using RHEL for this purpose for the past ~5 years, so maybe it has changed since although I doubt it.
I'm not sure what the underlying problem is here. Is the kABI guarantee worthless in general, or is it just that the MOFED and Lustre drivers need features not covered by any "kABI stability guarantee"?
I work on Lustre development. Lustre uses a lot of kernel symbols not covered by the kABI stability guarantee, and we already need to maintain configure checks for all of the other kernels (SUSE, Ubuntu, mainline, etc.) that don't offer kABI anyway. So in my opinion, it's not worth the effort to adhere to kABI just for RHEL, especially when RHEL derivatives might not offer the same guarantees. DKMS works well enough, especially for something open source like Lustre.
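To make the "symbols not covered" point concrete, here's a rough sketch of the check involved: take the symbols a module imports and subtract the ones on the distro's stable list. Everything here is an assumption for illustration. The symbol names are made up, and the input format is assumed to be one `<crc> <symbol>` pair per line, roughly what I'd expect from `modprobe --dump-modversions`. This is not how Lustre or Red Hat actually do it.

```python
def parse_modversions(text):
    """Parse lines of the assumed form '<crc> <symbol>' into a set of names."""
    symbols = set()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            symbols.add(parts[1])
    return symbols


def uncovered_symbols(needed, stable):
    """Return, sorted, the imported symbols that are not on the stable list."""
    return sorted(needed - stable)


# Hypothetical sample data; none of these are real kernel symbols.
SAMPLE_MODVERSIONS = """\
0x12345678\tkmalloc_example
0x9abcdef0\tvfs_internal_example
0x0badf00d\tprintk_example
"""
SAMPLE_STABLE = {"kmalloc_example", "printk_example"}

needed = parse_modversions(SAMPLE_MODVERSIONS)
missing = uncovered_symbols(needed, SAMPLE_STABLE)
print(missing)  # symbols the module uses that carry no stability promise
```

If that list is non-empty, the module can break on any kernel update regardless of kABI, which is why rebuilding per kernel with DKMS ends up being the practical answer.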
Honestly, I'm not sure who kABI is even designed for. None of the drivers I've interacted with in the HPC space (NVIDIA, Lustre, vendor network drivers, etc.) seem to adhere to kABI. DKMS is far more standard. I'd be interested to know which vendors are making heavy use of it.
The problem is that some vendors don't participate in or care about the kABI program. They have their reasons; maybe the cost of maintaining both RHEL compatibility and upstream compatibility is too high, so they simply choose whichever is the least painful to adhere to when a customer requests a fix.
If companies talked to partner engineering about their kABI requirements, I think there would be a lot less breakage. However, I'm sure that I'm oversimplifying the reasons they can't or won't do this.
I completely understand that the work is non-trivial and that they have many environmental pressures that affect their choices. kABI is the olive branch; they can take it or not.