> The other option would maybe be to make the entire first page of memory non-readable, and then have the MMU execute the trap for you (this is how real OSes do this null check, right?), but then you're wasting 4 KB in every WASM runtime. Doesn't seem ideal.
You're only wasting virtual address space, not actual RAM. So the only cost is that your maximum possible memory decreases by 4 KB. Hardly a noteworthy expense.
And this is an expense that WASM runtimes already have to pay anyway, since none of them are running on bare metal with address 0x0 available to them. So there are already address translations in play. Multiple of them, in fact, as you first have to translate from the WASM heap to the host's address space, and then from that through the MMU to physical pages.
And of course the WASM runtime already needs to trap for invalid addresses (eg, -1 right out the gate). How would trapping the first few pages be any more expensive than any other unmapped range? You're still working with a single contiguous region (at least until mmap finally lands). There doesn't seem to be any benefit to starting from 0x0 vs. starting from any other constant.
> You're only wasting virtual address space, not actual RAM.
Fair enough, that's true.
> And of course the WASM runtime already needs to trap for invalid addresses (eg, -1 right out the gate). How would trapping the first few pages be any more expensive than any other unmapped range?
Memory locations are presumably all unsigned, so -1 is not an issue. The thing I'm saying is that you're turning a test that was `if (addr < LIMIT)` into `if (addr != 0 && addr < LIMIT)`, which is not nothing. And it's on EVERY memory access.
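For reference, the two forms of the check can be modeled in a few lines of Python. This is only a sketch: `LIMIT` is an assumed heap size, the function names are made up for illustration, and 32-bit unsigned addresses are emulated with a mask.

```python
MASK32 = 0xFFFFFFFF  # emulate a 32-bit unsigned address
LIMIT = 0x10000      # assumed heap size in bytes (illustrative value)


def as_u32(value: int) -> int:
    """Reinterpret an integer as a 32-bit unsigned address."""
    return value & MASK32


def plain_check(addr: int) -> bool:
    """The original test: if (addr < LIMIT)."""
    return as_u32(addr) < LIMIT


def null_guarded_check(addr: int) -> bool:
    """The test with the extra null comparison: if (addr != 0 && addr < LIMIT)."""
    a = as_u32(addr)
    return a != 0 and a < LIMIT
```

In this model -1 becomes 0xFFFFFFFF and fails both tests, and the two tests disagree only at address 0.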
Thinking about it, there's no way you could rely on the MMU to be the ONLY memory check obviously. Like, the JIT can't translate a reference to `x = memory[ptr]` into just a `mov x, [BASE + ptr]`, because it could reference non-WASM memory, which is obviously a security no-no (unless it could statically guarantee `ptr < LIMIT`). You need some kind of runtime check in addition for security and stability, you can't get around it. Given that, I think I agree with you, might as well just do it. It's, like, two or three extra instructions or something? But I would not say that it's free.
> The thing I'm saying is that you're turning a test that was `if (addr < LIMIT)` into `if (addr != 0 && addr < LIMIT)`, which is not nothing. And it's on EVERY memory access.
But that's not what I'm saying. I'm saying every memory access would become:
    addr = addr - START_OFFSET   // e.g. 0x1000; unsigned, so anything below START_OFFSET wraps to a huge value
    if (addr < LIMIT) { do stuff }
There's just an extra sub in there. And you can definitely optimize that, same as any other bounds check optimization, which the runtime is already tasked with doing, since every load is already (wasm_address + host_array_buffer_ptr). If that add is free, the early sub likely is, too. They fall into the same category here.
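A rough Python model of that sub-then-compare check, to show why one comparison is enough. It assumes 32-bit unsigned wraparound (emulated with a mask); `START_OFFSET`, `LIMIT`, and `check_access` are illustrative names and values, not taken from any real runtime.

```python
MASK32 = 0xFFFFFFFF    # emulate 32-bit unsigned arithmetic
START_OFFSET = 0x1000  # assumed size of the trapping low region
LIMIT = 0x10000        # assumed size of the usable heap in bytes


def check_access(addr: int) -> bool:
    """Return True if a 1-byte access at wasm address `addr` is allowed."""
    # Addresses below START_OFFSET wrap around to a very large value here.
    rebased = (addr - START_OFFSET) & MASK32
    # A single unsigned compare then rejects both the low and the high end.
    return rebased < LIMIT
```

With these values, 0x0 and 0xFFF are rejected, 0x1000 is the first accepted address, and 0x1000 + LIMIT is rejected again.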
But you can also make it completely free by just ensuring there's a dead zone before the host array buffer location, such that anything in the range 0x0-0x1000 (or whatever) just lands before the array buffer's start location which has been mmap'd to trap. Then you don't need any changes at all.
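Here is a toy model of that dead-zone layout, just to show the address arithmetic. Everything in it is an assumption for illustration: the host addresses are made up, `MAPPED` stands in for whatever pages the host has actually made accessible, and only the low end is modeled (the upper bound still needs its own handling).

```python
PAGE = 0x1000
GUARD = PAGE                     # assumed dead zone size
HEAP_SIZE = 0x10000              # assumed wasm heap size in bytes
BUFFER_START = 0x7000_0000_1000  # hypothetical host address of the array buffer

# The base added to every wasm address sits one guard region *before* the buffer.
BASE = BUFFER_START - GUARD

# Host addresses that are actually accessible; the guard region is not among them.
MAPPED = range(BUFFER_START, BUFFER_START + HEAP_SIZE)


def host_address(wasm_addr: int) -> int:
    """The same single add the runtime already performs on every access."""
    return BASE + wasm_addr


def would_trap(wasm_addr: int) -> bool:
    """True if the access lands on a page the host left inaccessible."""
    return host_address(wasm_addr) not in MAPPED
```

In this model wasm addresses 0x0 through 0xFFF land in the guard region and trap, while 0x1000 maps to the first byte of the buffer, with no extra instruction on the access path.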
> Thinking about it, there's no way you could rely on the MMU to be the ONLY memory check obviously. Like, the JIT can't translate a reference to `x = memory[ptr]` into just a `mov x, [BASE + ptr]`, because it could reference non-WASM memory, which is obviously a security no-no (unless it could statically guarantee `ptr < LIMIT`). You need some kind of runtime check in addition for security and stability, you can't get around it.
On 64-bit systems, you actually can, since WASM is for now 32-bit only. Just reserve a 4 GB (plus 1 page) block of virtual memory starting at BASE, and there's no way for BASE+ptr (assuming ptr is 32-bit unsigned) to reach outside it. The extra page after the end is there to catch unaligned accesses at the very end of that 4 GB.
That is, you can statically guarantee "ptr < LIMIT" if "ptr" is 32 bits and LIMIT is 2^32 or more.