
There are a few parts of this I still struggle to understand. I don’t get why ptr2int casts are a problem if you never try to cast the integer back to a pointer. It seems like int2ptr is the real issue.

Also it’s said that casts to int are better than transmutes because casts have side effects. But ptr2int casts don’t actually have side effects; they are a no-op on the hardware.



They are a problem in the sense that the address of the pointer gets exposed. Once you lose track of who has it and who might do what with it, you can't track the aliasing information of the pointer, so you have to suppress some optimizations. But you are correct in the sense that if int2ptr never happens, it's all good.

About side effects: we are not talking about side effects at the hardware level, we are talking about side effects from the compiler's viewpoint. Again, the compiler might track aliasing information for optimizations, and casting has the side effect of "exposing" the pointer.
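To make the compiler's viewpoint concrete, here is a minimal Rust sketch (the function names are mine, purely illustrative): both functions compute the same value, but only the second performs a ptr2int cast, and that cast is the "exposure" side effect even though its result is never used.

```rust
fn no_expose() -> i32 {
    let mut x = 1;
    x += 1;
    // x's address was never observed: the compiler is free to keep x
    // in a register, or optimize it away entirely.
    x
}

fn expose() -> i32 {
    let mut x = 1;
    // The cast "exposes" x's address, even though the result is unused.
    // A conservative compiler must now assume some unknown int2ptr
    // elsewhere could reconstruct a pointer to x.
    let _addr = &mut x as *mut i32 as usize;
    x += 1;
    x
}

fn main() {
    assert_eq!(no_expose(), 2);
    assert_eq!(expose(), 2);
    println!("both return 2");
}
```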


> I don’t get why ptr2int casts are a problem if you never try and cast the integer back to a ptr.

AFAIK, you do understand. ptr2int casts are totally fine and defined behavior, as long as the program contains no int2ptr casts. Is there a passage from the OP that contradicts this?


From the section "Casts have a side-effect":

> But in this case, the operation in question is (uintptr_t)x, which has no side-effect – right? Wrong. This is exactly the key lesson that this example teaches us: casting a pointer to an integer has a side-effect, and that side-effect has to be preserved even if we don’t care about the result of the cast. ... We have to lose some optimization, as the example shows. However, the crucial difference to the previous section is that only code which casts pointers to integers is affected.

So even if we never even use the result, casting a pointer to an integer is problematic.

But in the explanation he only talks about the problems of the int2ptr cast, which I do understand.


The problem is that, if we assume that integers don’t have provenance, some far distant part of the code could guess the integer and do an int2ptr. If you can prove that nothing in the entire program could possibly do this for the entire lifetime of the original object, then sure, you could remove the ptr2int. But compiler optimizations usually work one function at a time. In some cases it might be feasible to prove this anyway, like if (a) you have a function that doesn’t call any other functions and (b) the object in question is a local variable that will go out of scope at the end of the function, making any further accesses UB regardless. But in most cases it’s not feasible.
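A sketch of that narrow (a)+(b) case (the function name is mine): the function calls nothing and the object is a local whose lifetime ends at the return, so any access through a guessed copy of the address after the function returns is UB regardless, and in principle a compiler could delete the ptr2int.

```rust
// No calls to other functions, and `x` is a local that dies at the end
// of the function, so no other code can legally use its address while
// it is live. The ptr2int below is removable in principle.
fn provably_unobservable() -> i32 {
    let x = 7;
    let _addr = &x as *const i32 as usize; // result unused, never escapes
    x
}

fn main() {
    assert_eq!(provably_unobservable(), 7);
    println!("{}", provably_unobservable());
}
```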


Indeed int2ptr is the "evil" operation. If we banned it, we could get rid of all this "exposed" stuff and ptr2int would be fine. However, in order to make int2ptr work, we have to also make ptr2int a bit more complicated. That's what the example shows: removing a ptr2int introduced UB into the program.

Rust now (experimentally) has a `ptr.addr()` operation that is like ptr2int without the "expose" part, i.e., the resulting integer cannot be cast back but can still be used for other purposes.
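A minimal sketch of the difference (assuming a toolchain where the strict-provenance `addr` method is available): both operations yield the same number, but only the `as usize` cast exposes the pointer.

```rust
fn main() {
    let x = 42u32;
    let p = &x as *const u32;

    // Plain ptr2int cast: exposes the pointer's provenance, so some
    // int2ptr elsewhere may legally reconstruct a pointer from it.
    let exposed = p as usize;

    // Strict-provenance alternative: same numeric address, but the
    // result cannot be cast back into a usable pointer. Fine for
    // hashing, logging, alignment math, etc.
    let addr = p.addr();

    assert_eq!(exposed, addr);
    println!("addresses match: {}", exposed == addr);
}
```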


I'd assume it has something to do with the idea that an optimizing compiler can no longer delete the address entirely once it's being used for "something", for example by turning the variable into a register-only variable.


> they are a No-op on the hardware.

That is not guaranteed. The only guarantee is that you can round-trip conversions via (u)intptr_t. The integer representation of a converted pointer can be completely different, to accommodate hardware like the Symbolics Lisp machine.


Are we losing optimizations on x86/ARM due to the mere existence of other hardware (like Symbolics or CHERI) that handles things differently?


You don't lose the optimizations, because UB and the aliasing rules let them stay in. But the people who want to make C safer by simply defining all UB would lose you all these optimizations.

ARM already includes a small part of CHERI (pointer signing) and the rest is coming.


I mean you are correct, but why else would a pointer be converted to an integer if not to cast it back at some point? I guess you can print it for debugging, but most other uses mean it will be used as a pointer at some point.


Another reason for converting pointers to integers without ever doing the reverse operation is to hash them.
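For example (a sketch, names are mine; assumes the strict-provenance `addr` method): using the address as a hash-map key is a ptr2int whose result only ever gets hashed and compared, never turned back into a pointer.

```rust
use std::collections::HashMap;

fn main() {
    let a = 1i32;
    let b = 2i32;

    // Keyed by address via the non-exposing `addr()`; the integers are
    // only hashed/compared, never cast back to pointers.
    let mut by_addr: HashMap<usize, &'static str> = HashMap::new();
    by_addr.insert((&a as *const i32).addr(), "a");
    by_addr.insert((&b as *const i32).addr(), "b");

    // Two distinct live objects have distinct addresses.
    assert_eq!(by_addr.len(), 2);
    println!("{}", by_addr.len());
}
```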


Some hand-vectorized code will do this to compute the number of non-vectorized elements that may exist before and after a suitably aligned region.
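A sketch of that pattern (the alignment constant and names are mine): computing the length of the scalar prologue before a 16-byte-aligned region needs only integer math on the address, never an int2ptr. (Rust's `align_offset` exists for exactly this purpose.)

```rust
fn main() {
    let data = [0u8; 64];
    let p = data.as_ptr();
    const ALIGN: usize = 16;

    // Number of scalar byte elements to handle before the first
    // 16-byte-aligned address: (-addr) mod ALIGN. Pure integer math;
    // the integer is never turned back into a pointer.
    let prefix = p.addr().wrapping_neg() % ALIGN;

    assert!(prefix < ALIGN);
    // The address just past the prologue is aligned.
    assert_eq!((p.addr() + prefix) % ALIGN, 0);
    println!("prefix = {prefix}");
}
```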




