an array of arrays is not necessarily contiguous in C(++). Indeed, if allocating...

int_19h · on Nov 28, 2018

An array of arrays is necessarily contiguous in C - this is implied by the type. An array of pointers to arrays will, of course, not be contiguous - and is the only way to get a dynamically sized heap-allocated 2D array in C (VLAs give you stack-allocated arrays, with all the size limits that entails).
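To make the distinction concrete, here is a minimal C sketch of the two layouts. The names `grid`, `jagged_alloc`, and `jagged_free` are hypothetical, chosen only for illustration: `grid` is a true array of arrays, whose type fixes a contiguous layout, while `jagged_alloc` builds an array of pointers whose rows are separate allocations with no guaranteed relationship between their addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { ROWS = 3, COLS = 4 };

/* Array of arrays: ROWS * COLS doubles laid out back to back. */
double grid[ROWS][COLS];

/* Free the first n rows of a pointer-per-row matrix, then the row table. */
void jagged_free(double **rows, size_t n)
{
    if (rows == NULL)
        return;
    for (size_t i = 0; i < n; i++)
        free(rows[i]);
    free(rows);
}

/* Array of pointers: every row is its own heap allocation, so rows
 * need not be adjacent in memory. Returns NULL on failure. */
double **jagged_alloc(size_t rows, size_t cols)
{
    assert(cols > 0);
    double **p = calloc(rows, sizeof *p);
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < rows; i++) {
        p[i] = calloc(cols, sizeof **p);
        if (p[i] == NULL) {
            jagged_free(p, i);  /* release the rows allocated so far */
            return NULL;
        }
    }
    return p;
}
```

Both forms are indexed as `m[i][j]`, but only for `grid` is `&grid[1][0]` guaranteed to sit exactly `COLS` elements after `&grid[0][0]`.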

In C++, all of this is best handled by a library class.

cygx · on Nov 28, 2018

Heap-allocated dynamically-sized NxM matrix in C99:

    double (*mat)[M] = calloc(N * M, sizeof (double));
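A slightly fuller sketch of the same idea, assuming a compiler with VLA support (mandatory in C99, optional since C11). The helper names `matrix_alloc` and `matrix_fill_and_sum` are made up for this example. The point is that a pointer to a variably modified type lets you index a single heap block as `mat[i][j]`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocate a zeroed n x m matrix as one contiguous heap block.
 * Returns NULL on failure; the caller releases it with free(). */
void *matrix_alloc(size_t n, size_t m)
{
    assert(m > 0);
    double (*mat)[m] = calloc(n, m * sizeof(double));
    return mat;
}

/* Store i*m + j in element (i, j) and return the sum of all elements.
 * The parameter type double (*)[m] carries the runtime row length. */
double matrix_fill_and_sum(size_t n, size_t m, double (*mat)[m])
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < m; j++) {
            mat[i][j] = (double)(i * m + j);
            sum += mat[i][j];
        }
    }
    return sum;
}
```

A caller would write something like `double (*mat)[m] = matrix_alloc(n, m);` and release the whole matrix with a single `free(mat)`. No VLA object ever lives on the stack here; only the type is variably modified.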

int_19h · on Nov 28, 2018

Ah, good point. I always forget that VLA types in C99 are actually types, and so you can use them in these contexts as well.

It's a shame that C11 demoted VLAs from a mandatory language feature to an optional one. They didn't make C into Fortran (which I think was the hope, between them, complex numbers, and "restrict"?), but they did make some things a great deal more pleasant to write.

cf498 · on Nov 28, 2018

> and is the only way to get a dynamically sized heap-allocated 2D array in C (VLAs give you stack-allocated arrays, with all the size limits that entails).

Why wouldn't you be able to create a dynamic array of arrays with placement new and a cast?

int_19h · on Nov 28, 2018

A cast to what, though? You need to be able to write the type of that array, but you can't do that unless dimensions are compile-time constants.

Const-me · on Nov 28, 2018

> A good tensor implementation accounts for strides that are SIMD compatible (e.g., each dimension is a multiple of the SIMD register width).

I've never implemented tensors, but in my experience you're sometimes better off doing what GPUs do with some texture formats: switching from a linear layout to a layout of dense blocks. E.g. for a dense float32 matrix and SSE code, a good choice is a 2D array of 4x4 blocks: each block is exactly one 64-byte cache line and occupies only 4 of the 16 SSE registers while being processed, i.e. you can do a lot of work within a block without hitting any RAM latency.
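One way the index arithmetic for such a layout might look, as a rough sketch only. The function name `blocked_index` and the choice of row-major ordering both between blocks and inside each block are assumptions made for this example, not something the comment specifies. It also assumes the column count is a multiple of 4; the one-block-per-cache-line property additionally requires a 64-byte-aligned buffer and 64-byte cache lines.

```c
#include <assert.h>
#include <stddef.h>

enum { BLOCK = 4 };  /* 4x4 float32 block = 16 floats = 64 bytes */

/* Offset, in floats, of element (row, col) in a matrix stored as a
 * row-major grid of 4x4 blocks, each block itself row-major.
 * Assumes cols is a multiple of BLOCK. */
size_t blocked_index(size_t row, size_t col, size_t cols)
{
    assert(cols % BLOCK == 0);
    size_t blocks_per_row = cols / BLOCK;
    size_t block  = (row / BLOCK) * blocks_per_row + (col / BLOCK);
    size_t within = (row % BLOCK) * BLOCK + (col % BLOCK);
    return block * (BLOCK * BLOCK) + within;
}
```

With this mapping, the 16 floats of any block occupy consecutive offsets, so each of its four rows can be loaded into one SSE register.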