
No, autograd acts similarly to PyTorch in that it builds a tape that it then reverses; PyTorch just ships with more optimized kernels (and kernels that run on GPUs). The AD I was referencing was tangent (https://github.com/google/tangent). It was an interesting project, but it's hard to see who the audience is. Generating Python source code makes things harder to analyze, and you cannot JIT compile the generated code unless you could JIT compile Python. So you might as well first trace to a JIT-compilable sublanguage and do the transformations there, which is precisely what Jax does. In theory tangent is a bit more general, and maybe you could mix it with Numba, but then it's hard to justify. If it's more general, then it's not for the standard ML community, for the same reason as the Julia tools, but then it had better do better than the Julia tools in the specific niche they are targeting. That generality also means it cannot use XLA, so from day one it wouldn't get the extra compiler optimizations that something built on XLA (Jax) gets. Jax just makes much more sense for the people who were building it; it chose its niche very well.
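
To make the contrast concrete, here's a minimal sketch of the two approaches. The toy functions and array values are made up; this assumes the standard autograd and Jax APIs (grad, jit):

    # Tape-based reverse mode (autograd): the call to f is recorded on a
    # tape at runtime, then the tape is replayed in reverse for gradients.
    import autograd.numpy as anp
    from autograd import grad

    def f(x):
        return anp.sum(anp.tanh(x) ** 2)

    df = grad(f)                  # builds and reverses the tape on every call
    print(df(anp.array([1.0, 2.0, 3.0])))

    # Trace-then-compile (Jax): the function is traced into a restricted
    # sublanguage (jaxprs) that XLA can JIT compile.
    import jax
    import jax.numpy as jnp

    def g(x):
        return jnp.sum(jnp.tanh(x) ** 2)

    dg = jax.jit(jax.grad(g))     # trace once, compile with XLA, reuse
    print(dg(jnp.array([1.0, 2.0, 3.0])))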



FYI - Tangent evolved into TF2's AutoGraph.
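
For anyone curious what that looks like in practice, here is a rough sketch; the shrink function is just a made-up example, and tf.autograph.to_code is TF2's helper for inspecting the generated source:

    import tensorflow as tf

    def shrink(x):
        # Plain Python while-loop with a tensor-valued condition.
        while tf.reduce_sum(x) > 1.0:
            x = tf.tanh(x)
        return x

    # Show the graph-compatible Python source that AutoGraph generates.
    print(tf.autograph.to_code(shrink))

    # AutoGraph runs implicitly when the function is wrapped in tf.function.
    shrink_graph = tf.function(shrink)
    print(shrink_graph(tf.constant([2.0, 2.0, 2.0])))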


Indeed, and that makes a lot of sense. The drawback of tangent is that you get the source-code translation but without the additional optimizations the technique can provide. It was then natural to target TensorFlow/XLA to do a similar thing and get the performance of TensorFlow as a result. The downside is that this loses the one true upside of tangent, which was that, by generating Python code, it could in theory be easier for a Python programmer to debug. But this was probably the right sacrifice to make for most people.
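
For reference, the debugging upside came from tangent returning plain Python source you can read and step through. A rough sketch of that workflow, going off the project README (the project is archived, so the exact API and flags may have drifted, and the toy function here is made up):

    import tangent

    def f(x):
        return x * x + 3.0 * x

    # Source-to-source translation: tangent.grad returns an ordinary Python
    # function; verbose=1 prints the generated derivative source, which is
    # what you could step through with a normal Python debugger.
    df = tangent.grad(f, verbose=1)
    print(df(2.0))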



