
  int* oracle();
  int foo() {
      int x = 1;
      *oracle() = 42;
      return x;
  }
Is the above program allowed to return anything other than 1 in your language?


To elaborate, we treat pointers as more than just integers because it gives optimizers the latitude to reorder and eliminate pointer operations. If pointers were plain integers, we could not do this in the example above, because we could not prove at compile time that x doesn't live at the address returned by oracle.

For some high-quality further discussion, see Ralf Jung's series of blog posts starting with https://www.ralfj.de/blog/2018/07/24/pointers-and-bytes.html


  However, given how low-level a language C++ is, we can actually break this assumption by setting i to y-x. Since &x[i] is the same as x+i, this means we are actually writing 23 to &y[0].
But that is undefined: you can't compute x + (y - x), i.e. pointer arithmetic that ends outside the bounds of an array. Since it is undefined, shouldn't C++ assume that changing x[..] can't change y[0]?

edit: welp, if I had read a few more lines into the article I would have seen that it also says this is undefined
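
For contrast, here is a minimal sketch of pointer arithmetic that stays inside one array, which is the defined case. The function name `span_within_array` and the array size are made up for illustration; the cross-array form from the quote appears only in a comment because it is undefined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: distance between the start of an array and its
   one-past-the-end pointer. Both pointers are derived from the same array. */
ptrdiff_t span_within_array(void) {
    int x[8] = {0};
    int *first = &x[0];
    int *end = x + 8;      /* one past the end: may be formed, not dereferenced */
    /* int *p = x + (y - x);   undefined if y is a different array */
    return end - first;    /* defined: same array on both sides */
}

void demo_span(void) {
    assert(span_within_array() == 8);
}
```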


To be clear, in my example the result of oracle() cannot possibly alias with 'x' in C or C++ (and in fact gcc will optimize accordingly). In a different language where addresses are mere integers, things would be more complicated.


The result of oracle can point to anything if you write it as return (int *)rand();

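Note that rand() returns an int with at most 31 random bits on common platforms (RAND_MAX is only guaranteed to be at least 32767), so you have to call it more than once and merge the results to obtain a 64-bit pointer.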


The numerical value returned by oracle might physically match the address of the stack slot for 'x', assuming that it exists, but it doesn't mean that, from a language point of view, it is a valid pointer.

If forging pointers had defined behaviour, it would be impossible to use the language sanely or perform any kind of optimization.


Is it allowed to return anything else in C? Is there anything in standard C that would allow oracle() to access the memory address of x?

Sure, different compilers might allow inline assembly or some other way to access x on the previous stack frame, but then it is not really "C".


That’s the point. C allows this function to be optimized to always return 1. A “pointers are addresses, just emit reads and writes and stop trying to be so clever” version of C would require x to be spilled to the stack, then the write, then reload x and return whatever it contained.
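
A rough sketch of the two shapes being compared, assuming a hypothetical oracle() that returns a pointer to some unrelated, valid object (`scratch` is invented for the example). `foo_as_optimized` is hand-written to show what a compiler may effectively emit; it is not actual compiler output.

```c
#include <assert.h>

static int scratch;                 /* assumed target of oracle(), unrelated to x */

int *oracle(void) { return &scratch; }

/* The function from the top of the thread. */
int foo(void) {
    int x = 1;
    *oracle() = 42;                 /* cannot legally refer to x: &x never escaped */
    return x;
}

/* What the optimizer is allowed to turn foo into: keep the store,
   fold the return value to the constant 1, never give x a stack slot. */
int foo_as_optimized(void) {
    *oracle() = 42;
    return 1;
}

void demo_foo(void) {
    assert(foo() == 1);
    assert(foo_as_optimized() == 1);
    assert(scratch == 42);
}
```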


Then use the register keyword, or just reword the standard to assume the register behavior if a variable's address hasn't been taken.

The majority of useful optimizations can be kept in a "Sane C" with either code style changes (cache stuff in local vars to avoid aliasing for example) or with minor tweaks to the standard.


Register behavior is what you want essentially all of the time. So we’d just have to write `register` all over the place for no gain.

“Don’t optimize this, read and write it even if you think it’s not necessary” is a very rare case so it shouldn’t be the default. If you want it, use the volatile keyword.

There’s no need to reword the standard to assume the register behavior if the variable’s address hasn’t been taken. That’s already how it works. In this example, if you escape the value of `&x`, it’s not legal to optimize this function to always return 1.
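
To make the two cases concrete, here is a small sketch. All helper names (`publish`, `oracle_scratch`, `oracle_leaked`) are hypothetical. The first function uses volatile to force the store and reload of x; the second lets &x escape, so the write through the oracle may legitimately hit x.

```c
#include <assert.h>

static int scratch;                 /* unrelated object */
static int *leaked;                 /* where an escaped address ends up */

static void publish(int *p) { leaked = p; }
static int *oracle_scratch(void) { return &scratch; }
static int *oracle_leaked(void)  { return leaked; }

/* volatile: the compiler must really write x and read it back,
   even though the result is still 1 with this oracle. */
int foo_volatile(void) {
    volatile int x = 1;
    *oracle_scratch() = 42;
    return x;
}

/* &x escapes through publish(), so the oracle can return it and
   the function can no longer be folded to "return 1". */
int foo_escaped(void) {
    int x = 1;
    publish(&x);
    *oracle_leaked() = 42;          /* writes to x through the leaked pointer */
    return x;
}

void demo_escape(void) {
    assert(foo_volatile() == 1);
    assert(foo_escaped() == 42);
}
```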


When using C, this can return anything (or crash if the oracle function returns an invalid pointer, or rewrite its own code if the code section is writable). So if you get rid of the "abstract machine", nothing changes: the program can return anything or crash.


The point is that the C standard does guarantee that the function returns 1 if the program is a valid C program - which means there is no UB.

For example: If the oracle function returns an invalid pointer, then dereferencing that pointer is UB, and therefore the program isn't a valid C program.


A conforming C compiler is allowed to emit that function to perform the write and then return the constant 1. Should that be allowed?


Well, even in C it is not guaranteed to return 1, since oracle() may return the memory address of variable 1.


The literal 1 is not an object in C or C++, hence it does not have an address. If you meant 'x', then also no: oracle() can't return the address of 'x' because of pointer provenance rules.



