
How does gcc infer anything about memcpy? Can't I replace the c-library memcpy with my own, so how does it know that dest == NULL can never be true?


You can, but gcc may replace it with an equivalent set of instructions as a compiler optimization, so you would have no guarantee it is used unless you hack the compiler.

On a related note, GCC optimizing away things is a problem for memset when zeroing buffers containing sensitive data, as GCC can often tell that the buffers are going to be freed and thus the write is deemed unnecessary. That is a security issue and has to be resolved by breaking the compiler’s optimization through a clever trick:

https://github.com/openzfs/zfs/commit/d634d20d1be31dfa8cf06e... 12352

Similarly, GCC may delete a memcpy to a buffer about to be freed, although I have never observed that as you generally don’t do that in production code.


> Similarly, GCC may delete a memcpy to a buffer about to be freed, although I have never observed that as you generally don’t do that in production code.

It's not that crazy. You could have a refcounted object that poisons itself when the refcount drops to zero, but doesn't immediately free itself because many malloc implementations can have bad lock contention on free(). So you poison the object to detect bugs, possibly only in certain configurations, and then queue the pointer for deferred freeing on a single thread at a better time.

(Ok, this doesn't quite do it: poisoning is much more likely to use memset than memcpy, but I assume gcc would optimize out a doomed memset too?)


Yes, it potentially could be optimised out, which is why platforms provide functions like SecureZeroMemory() for cases where you want to be sure that memory is zeroed out.


That would be why I introduced an explicit_memset() into the OpenZFS encryption module in the commit that I linked. It uses two different techniques to guard against the compiler deleting it.
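For readers who don't follow the link, here is a rough sketch of two tricks commonly used for this purpose. It is not copied from the linked commit, which may differ in detail. The name secure_memset is made up for illustration, and the asm barrier assumes GCC or Clang.

```c
#include <stddef.h>
#include <string.h>

/* Technique 1: call memset through a volatile function pointer. The
 * compiler has to load the pointer at run time, so it cannot assume the
 * callee is the memset it knows how to delete. */
static void *(*const volatile memset_ptr)(void *, int, size_t) = memset;

/* Hypothetical helper name, for illustration only. */
void *secure_memset(void *buf, int c, size_t len)
{
    void *ret = memset_ptr(buf, c, len);

    /* Technique 2 (GCC/Clang extension): an empty asm statement that takes
     * buf as an input and clobbers memory, so the stores above must be
     * treated as observed. */
    __asm__ __volatile__("" : : "r"(buf) : "memory");

    return ret;
}
```

Usage would be something like secure_memset(key, 0, sizeof key) right before the buffer is freed. Where the platform already provides explicit_bzero(), memset_s() or SecureZeroMemory(), those are probably the better choice.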


The valid inputs to memcpy() are defined by the C specification, so the compiler is free to make assumptions about what valid inputs are, even if the library implementation chooses to allow a broader range of inputs.


Per ISO C, the identifiers declared or defined with external linkage by any C standard library header are considered reserved, so the moment you define your own memcpy, you're already in UB land.


Many standard C functions are treated as “magic” by compilers. malloc is treated as if it has no side effects (of course it does have one, it changes allocator state, but not one we care about observing), so the optimiser can elide allocations. Without that special treatment the call could not be elided, because to the compiler malloc would look like any other function with side effects.
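A small sketch of what that looks like in practice. The function name sum_via_heap is made up. With optimisation enabled, GCC and Clang can typically reduce this to a plain addition with no call to malloc or free, but that depends on the compiler and version, so check the generated assembly rather than taking it on faith.

```c
#include <stdlib.h>

/* Hypothetical example: the heap allocation is only a temporary, so a
 * compiler that treats malloc/free as "magic" may remove both calls. */
int sum_via_heap(int a, int b)
{
    int *tmp = malloc(sizeof *tmp);
    if (tmp == NULL)
        return a + b;   /* same result whether or not the allocation works */

    *tmp = a + b;
    int result = *tmp;
    free(tmp);
    return result;
}
```

If malloc were an ordinary opaque function, the compiler would have to keep the call, since it could not prove the call had no visible effect.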


Not only that, malloc is also assumed to return a pointer that doesn't alias anything else.


If I'm understanding the OP correctly, the C standard says so, i.e. the semantics of memcpy are defined by the standard and the standard says that it's UB to pass NULL.
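To make that concrete, here is a minimal sketch (function names are made up) of the kind of code affected. Under a standard where passing NULL to memcpy is undefined, the compiler is allowed to reason that dest cannot be NULL once the call has happened, and may drop the later check, even when n is 0.

```c
#include <stddef.h>
#include <string.h>

/* The check comes too late: after memcpy(dest, ...), the compiler may
 * assume dest != NULL and remove the branch entirely. */
int copy_then_check(char *dest, const char *src, size_t n)
{
    memcpy(dest, src, n);
    if (dest == NULL)       /* may be optimised away */
        return -1;
    return 0;
}

/* Checking before the call keeps every memcpy call within the inputs
 * the standard defines, so nothing can be inferred from it. */
int check_then_copy(char *dest, const char *src, size_t n)
{
    if (dest == NULL || src == NULL)
        return -1;
    memcpy(dest, src, n);
    return 0;
}
```

Whether a given compiler actually removes the check in copy_then_check depends on the version and optimisation level. The point is only that it is permitted to.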


Unlike all the more complicated languages, C's "freestanding" mode doesn't even have a memcpy feature, so it may not define how one works - maybe you've decided to use the name "memcpy" for your function which generates a memorandum about large South American rodents, and "memo_capybara" was too much typing.

In something like C++ or Rust, even their bare metal "What do you mean Operating System?" modes quietly require memcpy and so on because we're not savages, clearly somebody should provide a way to copy bytes of memory, Rust is so civilised that even on bare metal (in Rust's "core" library) you get a working sort_unstable() for your arbitrary slice types!


The compiler is free to give a meaning to memcpy if run in the (default) hosted mode. There's -ffreestanding for freestanding environments.


Right, though I guess I wasn't clear enough about that for the downvoters, but whatever.


If you do so, you have to add -fno-builtin (or just -fno-builtin-memcpy).



