What I am interested in is a programmable vector fabric that I can reconfigure fast (in, say, L2/L3 access times). Right now there are hundreds of AVX2 instructions, and as the number of HPC applications grows that count is only going to explode. My problem is that when you are using one vector instruction, the silicon for the rest is just sitting around, when it could perhaps be used to make a much wider SIMD unit for just the instructions I'll be using. If they can get just that right, IMO it'll be a huge success.

IMO the problem with FPGA or even silicon dev is mostly tooling. I often say this: the biggest contribution to the open source movement is not Linus/Linux, it's gcc. Just imagine if 'they' could tangle up every bit of open source code in IP litigation emerging from proprietary compilers.