Python should and will be replaced, but not for any of the reasons mentioned in this thread.
A good ML language is going to need smart, static typing. I am so tired of having to run a whole network just to find out that there's a dimension mismatch because I forgot to take a transpose somewhere - there is essentially no reason tensor shapes can't just be inferred and these errors caught before runtime.
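A minimal sketch of the kind of failure I mean (the layer and sizes here are made up, but the failure mode is the point: nothing complains until data actually flows through):

    import torch
    import torch.nn as nn

    layer = nn.Linear(128, 64)     # expects inputs of shape (..., 128)
    x = torch.randn(32, 128).t()   # stray transpose: now (128, 32) instead of (32, 128)
    y = layer(x)                   # RuntimeError - but only once this line actually runs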
Do you have an example of a tensor library that keeps track of shapes and detects mismatches at compile time? I had the impression that, even in static languages, carrying the exact shape of a tensor as a type parameter would stress the compiler, forcing it to compile a version of every function for every possible size combination. And the output of a function could very well have a nondeterministic shape, or several possible shapes (for example, when branching on runtime information). So libraries compromise and make only the dimensionality a parameter, which would not catch your example either until the runtime bounds checks.
If you explain a little more of what you mean, I might be able to respond more effectively.
> I had the impression that, even in static languages, carrying the exact shape of a tensor as a type parameter would stress the compiler, forcing it to compile a version of every function for every possible size combination. And the output of a function could very well have a nondeterministic shape, or several possible shapes (for example, when branching on runtime information).
I was a bit lazy in my original comment - you're right. What I really think should be implemented (and is already starting to appear, albeit non-statically, in PyTorch and in a library called namedtensor) is essentially having "typed axes."
For instance, if I had a sequence of locations in time, I could describe the tensor as:
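Something along these lines - the axis types are hypothetical, and the closest existing thing I know of is PyTorch's experimental named tensors, which only check names at runtime:

    import torch

    # What I'd like to write and have checked statically
    # (hypothetical syntax, not real PyTorch):
    #   locations: Tensor[TimeAxis, BatchAxis, CoordAxis]

    # Closest runtime equivalent today, with named tensors:
    # 100 time steps, a batch of 32 trajectories, 3 spatial coordinates
    locations = torch.randn(100, 32, 3, names=('time', 'batch', 'xyz'))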
Sure, the number of dimensions could vary, and you're right that, if so, the approach implied by my first comment would have a combinatorial explosion. But if I'm accidentally contracting a TimeAxis with a BatchAxis, that can be caught pretty easily before I ever have to run the code. In normal PyTorch, such a contraction would succeed - and it would succeed silently.
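To make that concrete: in plain PyTorch, something like the following goes through without a peep whenever the sizes happen to line up (toy sizes, but the silence is the point):

    import torch

    x = torch.randn(32, 32)   # meant as (time, features)
    w = torch.randn(32, 32)   # meant as (features, hidden)
    y = x.t() @ w             # stray transpose contracts the time axis with w's feature axis - no error, just wrong numbers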
You understood it correctly. Named dimensions are certainly a good idea even in dynamic languages, as a form of documentation and a way to get runtime errors that actually make sense (instead of something like "expected shape (24, 128, 32) but got (32, 128, 24)"). I hope it catches on.
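For example, with PyTorch's experimental named tensors (runtime-only, but the error at least tells you which axes disagree):

    import torch

    a = torch.randn(8, 8, names=('time', 'batch'))
    b = torch.randn(8, 8, names=('batch', 'time'))
    c = a + b   # RuntimeError pointing out that dim 'batch' and dim 'time' do not match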
But combined with static checking, it could be very powerful. Definitely agree re: the usefulness even in dynamic languages - I use namedtensor for almost all my personal development now (less so at work because of interop issues).
There's a research language that supports compile-time checking of array dimensions: Futhark [1]. It's an interesting language that compiles to CUDA or OpenCL. However, it's probably not ready for production (not sure if there are even good linear algebra implementations yet). It does feature interesting optimizations to account for the possible size ranges of the arrays (the development blog is very instructive in that respect).
Julia, for example, does have an array library that does compile-time shape inference (StaticArrays), but it cannot scale to large arrays (over ~100 elements) exactly because it gets too hard for the compiler to keep track. I'm definitely curious about possible solutions.