Accelerate is highly performant on Apple hardware (the current Intel arch). I expect Apple to ensure the same for their M-series CPUs, potentially even taking advantage of the tensor and GPGPU capabilities available in the SoC.
Huh, this actually may end up solving many of my issues, so thanks for finding that! Outside of their documentation being terrible, they do claim the correct algorithms, so it's something to at least investigate.
By the way, if anyone at Apple reads this, thanks for the library, but, you know, documenting the calling conventions, algorithms, and options would really help on pages like this:
That's the documentation page for an enumeration value, not a factorization routine (hence there are no calling conventions, etc., to document; it's just a constant).
There is also _extensive_ documentation in the Accelerate headers, maintained by the Accelerate team rather than a documentation team, which should always be considered ground truth. Start with Accelerate/vecLib/Sparse/Solve.h (for a normal Xcode install, that's in the file system here):
I noticed that SciPy has dropped support for Accelerate. I believe it wasn't only related to bugs, but also to a very dated LAPACK implementation (circa 2009). I can't tell from Apple's developer docs whether this has changed.
My sense is that Apple's focus is less on scientific computing and more on enabling developers to build computation-heavy multimedia applications.
Accelerate is also available (and highly performant) on ARM. I was not able to beat it with anything on ARM, including hand-coded assembly, at least for sgemm and simple dot products, which are the bread and butter of deep learning. It actually baffles me that Microsoft is not offering linear algebra and DSP acceleration in Windows out of the box. This creates friction, and most devs don't give a shit, so Windows users end up with worse perf on essentially the same hardware.
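For anyone not familiar with the two operations mentioned above, here is a minimal pure-Python sketch of what sgemm (C = alpha * A * B + beta * C) and a dot product compute. This is only a readable reference for checking results, not how Accelerate or any tuned BLAS implements them; the names `sgemm_ref` and `dot_ref` are made up for illustration, matrices are assumed to be row-major flat lists, and Python floats are double precision rather than the single precision that sgemm actually uses.

```python
def sgemm_ref(m, n, k, alpha, a, b, beta, c):
    """Reference C = alpha * A @ B + beta * C on row-major flat lists.

    a is m x k, b is k x n, c is m x n. Returns a new m x n flat list.
    (Hypothetical helper for illustration; not part of any library.)
    """
    out = [0.0] * (m * n)
    for i in range(m):
        for j in range(n):
            acc = 0.0
            # Inner product of row i of A with column j of B.
            for p in range(k):
                acc += a[i * k + p] * b[p * n + j]
            out[i * n + j] = alpha * acc + beta * c[i * n + j]
    return out


def dot_ref(x, y):
    """Reference dot product of two equal-length sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))


# Small 2 x 2 example.
a = [1.0, 2.0,
     3.0, 4.0]
b = [5.0, 6.0,
     7.0, 8.0]
c = [1.0, 1.0,
     1.0, 1.0]

result = sgemm_ref(2, 2, 2, 2.0, a, b, 1.0, c)
print(result)
print(dot_ref([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

A triple loop like this is the baseline that tuned libraries improve on with blocking, vectorization, and cache-aware data layout, which is where the large speedups come from.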
ARM themselves made a half-hearted attempt at addressing this with their Ne10 project (https://github.com/projectNe10/Ne10), but as far as I could see from the outside they never committed any real resources to it, and it now seems to be abandoned (no public commits for three years).
https://developer.apple.com/documentation/accelerate/sparse_...