Python should and will be replaced, but not for any of the reasons mentioned in this thread.
A good ML language is going to need smart, static typing. I am so tired of having to run a whole network just to find out that there's a dimension mismatch because I forgot to take a transpose somewhere - there is essentially no reason tensor shapes can't just be inferred and these errors caught before runtime.
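A minimal sketch of the kind of failure I mean (the layer and sizes here are made up, but the failure mode is the point: nothing complains until data actually flows through):

    import torch
    import torch.nn as nn

    layer = nn.Linear(128, 64)     # expects inputs of shape (..., 128)
    x = torch.randn(32, 128).t()   # stray transpose: now (128, 32) instead of (32, 128)
    y = layer(x)                   # RuntimeError - but only once this line actually runs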
Do you have an example of a tensor library that keeps track of shapes and detects mismatches at compile time? I had the impression that, even in static languages, carrying the exact shape of a tensor as a type parameter would stress the compiler, forcing it to compile a version of every function for every possible size combination. And the output of a function could very well have a nondeterministic shape, or several possible shapes (for example, when branching on runtime information). So libraries compromise and make only the dimensionality a parameter, which would not catch your example either until the runtime bounds checks.
If you explain a little more of what you mean, I might be able to respond more effectively.
> I had the impression that, even in static languages, carrying the exact shape of a tensor as a type parameter would stress the compiler, forcing it to compile a version of every function for every possible size combination. And the output of a function could very well have a nondeterministic shape, or several possible shapes (for example, when branching on runtime information).
I was a bit lazy in my original comment - you're right. What I really think should be implemented (and is already starting to appear, albeit non-statically, in PyTorch and in a library called namedtensor) is essentially having "typed axes."
For instance, if I had a sequence of locations in time, I could describe the tensor as:
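Something along these lines - the axis types are hypothetical, and the closest existing thing I know of is PyTorch's experimental named tensors, which only check names at runtime:

    import torch

    # What I'd like to write and have checked statically
    # (hypothetical syntax, not real PyTorch):
    #   locations: Tensor[TimeAxis, BatchAxis, CoordAxis]

    # Closest runtime equivalent today, with named tensors:
    # 100 time steps, a batch of 32 trajectories, 3 spatial coordinates
    locations = torch.randn(100, 32, 3, names=('time', 'batch', 'xyz'))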
Sure, the number of dimensions could vary, and you're right that, if so, the approach implied by my first comment would have a combinatorial explosion. But if I'm accidentally contracting a TimeAxis with a BatchAxis, that can be caught pretty easily before I ever have to run the code. In normal PyTorch, such a contraction would succeed - and it would succeed silently.
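To make that concrete: in plain PyTorch, something like the following goes through without a peep whenever the sizes happen to line up (toy sizes, but the silence is the point):

    import torch

    x = torch.randn(32, 32)   # meant as (time, features)
    w = torch.randn(32, 32)   # meant as (features, hidden)
    y = x.t() @ w             # stray transpose contracts the time axis with w's feature axis - no error, just wrong numbers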
You understood it correctly. Named dimensions are certainly a good idea even in dynamic languages, as a form of documentation and a way to get runtime errors that actually make sense (instead of something like "expected shape (24, 128, 32) but got (32, 128, 24)"). I hope it catches on.
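For example, with PyTorch's experimental named tensors (runtime-only, but the error at least tells you which axes disagree):

    import torch

    a = torch.randn(8, 8, names=('time', 'batch'))
    b = torch.randn(8, 8, names=('batch', 'time'))
    c = a + b   # RuntimeError pointing out that dim 'batch' and dim 'time' do not match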
But combined with static checking, it could be very powerful. Definitely agree re: the usefulness even in dynamic languages - I use namedtensor for almost all my personal development now (less so at work because of interop issues).
There's a research language that supports compile-time checking of array dimensions: Futhark [1]. It's an interesting language that compiles to CUDA or OpenCL. However, it's probably not ready for production (not sure if there are even good linear algebra implementations yet). It does feature interesting optimizations to account for the possible size ranges of the arrays (the development blog is very instructive in that respect).
Julia, for example, does have an array library that does compile-time shape inference (StaticArrays), but it cannot scale to large arrays (over ~100 elements) exactly because it gets too hard for the compiler to keep track. I'm definitely curious about possible solutions.