One of the gripes that I have with Julia is that if you write linear algebra code naively, you will have tons of unnecessary temporary allocations, while in Eigen (a C++ library) you can avoid most of these without sacrificing too much readability. (It even optimizes how to run matrix kernels on the fly!) Sure, you can rewrite your Julia code in C-style to remove those temporary allocations, but then the code becomes even less readable than what you can achieve in C++.
The naive Julia version has unnecessary allocations and therefore is 23% slower than the optimized version:
@inbounds for k = 2:60000
    Pp .= Fk_1 * Pu * Fk_1' .+ Q
    K .= Pp * Hk' * pinv(R .+ Hk * Pp * Hk')
    aux1 .= I18 .- K * Hk
    Pu .= aux1 * Pp * aux1' .+ K * R * K'
    result[k] = tr(Pu)
end
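You can verify the hidden temporaries with `@allocated`: the `.=` assignment itself is in place, but each matrix product on the right-hand side still materializes a fresh array first. A minimal sketch (the 4x4 sizes and the helper `g` are made up for illustration, not from the benchmark):

```julia
# Sketch: `C .= A * B` writes into C in place, but `A * B` still
# allocates a temporary matrix on the heap before the broadcast.
A = rand(4, 4); B = rand(4, 4); C = similar(A)

g(C, A, B) = (C .= A * B)

g(C, A, B)                         # warm up so compilation isn't counted
@assert @allocated(g(C, A, B)) > 0 # the product's temporary shows up here
```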
In order for this loop to match the C++ version you need to use C-style functions:
for k = 2:60000
    # Pp = Fk_1 * Pu * Fk_1' + Q
    mul!(aux2, mul!(aux1, Fk_1, Pu), Fk_1')
    @. Pp = aux2 + Q

    # K = Pp * Hk' * pinv(R + Hk * Pp * Hk')
    mul!(aux4, Hk, mul!(aux3, Pp, Hk'))
    mul!(K, aux3, pinv(R + aux4))

    # Pu = (I - K * Hk) * Pp * (I - K * Hk)' + K * R * K'
    mul!(aux1, K, Hk)
    @. aux2 = I18 - aux1
    mul!(aux6, mul!(aux5, aux2, Pp), aux2')
    mul!(aux5, mul!(aux3, K, R), K')
    @. Pu = aux6 + aux5

    result[k] = tr(Pu)
end
... which is quite dirty. But you can write the same thing in C++ like this (and even be a bit faster than Julia!):
for (int k = 2; k <= 60000; k++) {
    Pp = Fk_1*Pu*Fk_1.transpose() + Q;
    aux1 = R + Hk*Pp*Hk.transpose();
    pinv = aux1.completeOrthogonalDecomposition().pseudoInverse();
    K = Pp*Hk.transpose()*pinv;
    aux2 = I18 - K*Hk;
    Pu = aux2*Pp*aux2.transpose() + K*R*K.transpose();
    result[k-1] = Pu.trace();
}
which is much more readable than Julia's optimized version.
If Julia had a linear-algebra-aware optimizing compiler (without the sheer madness of the C++ template metaprogramming that Eigen uses), then Julia's standing in HPC would be much, much better. I admit that it's a hard goal to achieve, since I haven't seen any language attempt it; the closest thing I've seen is LLVM's matrix intrinsics (https://clang.llvm.org/docs/MatrixTypes.html), but those are only a proposal.
Oh, you're right. I naively thought that the Julia code was written with static-size arrays.
But still, wouldn't even StaticArrays.jl create some unnecessary allocations and copies when the code is written in the naive (clean) way? For example, when calculating something like z = A*x + B*y, A*x and B*y are allocated as temporaries in Julia's case, but you can avoid this in Eigen?
StaticArrays.jl will create arrays that are stack allocated and optimized just like in Eigen. There are no temporaries in that example.
Even if you used MArrays, which are small, statically sized mutable arrays, the intermediate temporaries would live on the stack and thus not cause any heap allocations.
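A quick way to check this, assuming StaticArrays.jl is installed (the 3x3 sizes and the helper names `f` and `no_heap_allocs` are mine, for illustration only):

```julia
using StaticArrays

f(A, B, x, y) = A*x + B*y       # the "z = Ax + By" pattern from above

function no_heap_allocs()
    A = @SMatrix rand(3, 3)
    B = @SMatrix rand(3, 3)
    x = @SVector rand(3)
    y = @SVector rand(3)
    f(A, B, x, y)               # warm up (compile)
    @allocated f(A, B, x, y)    # intermediates stay on the stack
end

@assert no_heap_allocs() == 0
```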
I suppose I should caveat my grandparent comment by saying that StaticArrays and MArrays are only good up to about 14 x 14, if memory serves. That rule of thumb might be out of date; the compiler might handle large tuples more efficiently these days.
I believe Eigen has similar problems, but I'm not sure.
However, there is work happening on large stack-allocated arrays in Julia: StrideArrays.jl can create stack-allocated arrays of any size that will physically fit on the stack.
In this code, the main problem, I think, is that intermediate results are being allocated, e.g., Fk_1 * Pu * Fk_1'. I will speculate that you could improve on the baseline code by preallocating these in the same way that Pp, K, aux1, and Pu are initialized outside the loop.
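That preallocation pattern, sketched minimally for just the first line of the loop (the 18x18 size is my assumption, to match `I18`; only the standard-library `LinearAlgebra` is used):

```julia
using LinearAlgebra

# Sketch: compute Pp = Fk_1 * Pu * Fk_1' + Q with no allocations
# inside the loop, by reusing one preallocated intermediate buffer.
n = 18
Fk_1 = rand(n, n); Pu = rand(n, n); Q = rand(n, n)
Pp  = similar(Pu)         # preallocated output
tmp = similar(Pu)         # preallocated buffer for Fk_1 * Pu

for k in 2:5              # a few iterations for illustration
    mul!(tmp, Fk_1, Pu)   # tmp = Fk_1 * Pu    (in place)
    mul!(Pp, tmp, Fk_1')  # Pp  = tmp * Fk_1'  (in place)
    Pp .+= Q              # fused broadcast, no temporary
end

@assert Pp ≈ Fk_1 * Pu * Fk_1' + Q
```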
Are you sure that the difference is due to the allocations? I would expect this to be dominated by the matrix multiplies or SVDs. Are you comparing with the same BLAS/LAPACK?
Edit: OK, I see those are small matrices. Then StaticArrays should be a nice contender here, both for speed and readability.
Here's an example: https://ronanarraes.com/tutorials/julia/my-julia-workflow-re...