> This option is enabled by default on most targets.
What a footgun.
I understand that, in an effort to compete with other compilers for relevance, GCC pursued performance over safety. Has that era passed? Could GCC choose safer over fast?
Alternatively, has someone compiled a list of flags one might want to enable in latest GCC to avoid such kinds of dangerous optimizations?
Just for the record, that's not the main purpose of -fdelete-null-pointer-checks.
Normally, it only deletes null checks after actual null pointer dereferences. In principle this can't change observable behavior. Null dereferences are guaranteed to trap, so if you don't trap, it means the pointer wasn't null. In other words, unlike most C compiler optimizations, -fdelete-null-pointer-checks should be safe even if you do commit undefined behavior.
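A minimal sketch of the pattern being described (my own example, with made-up names):

    #include <stddef.h>
    int f(int *p) {
        int v = *p;        /* if p were NULL, this dereference would trap... */
        if (p == NULL)     /* ...so the compiler may delete this check */
            return -1;
        return v;
    }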
This once caused a kerfuffle with the Linux kernel. At the time, x86_64 CPUs allowed the kernel to dereference userspace addresses, and the kernel allowed userspace to map address 0. Therefore, it was possible for userspace to arrange for null pointers to not trap when dereferenced in the kernel. Which meant that the null check optimization could actually change observable behavior. Which introduced a security vulnerability. [1]
Since then, Linux has been compiled with `-fno-delete-null-pointer-checks`, but it's not really necessary: Linux systems have long since enforced that userspace can't map address 0, which means that deleting null pointer checks should be safe in both kernel and userspace. (Newer CPU security features also protect the kernel even if userspace is allowed to map address 0.)
But anyway, I didn't know that -fdelete-null-pointer-checks treated "memcpy with potentially-zero size" as a condition to remove subsequent null pointer checks. That means that the optimization actually isn't safe! Once GCC is updated to respect the newly well-defined behavior, though, it should become truly safe. Probably.
The same can't be said for most UB optimizations – most of which can't be turned off.
I once spent hours if not days debugging a problem with some code I had recently written because of this exact optimization.
It wasn't an embedded system, but rather an x86 BIOS boot loader, which is sort of halfway there. Protected mode enabled without paging, so there's nothing to trap a NULL.
Completely by accident I had dereferenced a pointer before doing a NULL check. I think the dereference was just printing some integer, which of course had a perfectly sane-looking value so I didn't even think about it.
The compiler, I can't remember if it was gcc or clang by this point, decided that since I had already successfully dereferenced the pointer it could just elide the null check and the code path associated with it.
Finally I ran it in VMware and attached a debugger, which skipped right over the null check even though I could see in the debugger the value was null. So then I went to look at the assembly the compiler generated, and that's when I started to understand what had happened.
It was a head-slapper when I found the dereference above. I added a second null check or moved that code or some such, and that was it.
Now map the hours and days spent into actual money taken from the project budget, and you realise why some businesses prefer some languages over others.
There was a more egregious one which got Linus further pissed off with GCC, due to a 'dereference' that would not trap but still deleted a later null check (because e.g. int *foo = &bar->baz is basically just calculating an offset to bar, and so will not fail at runtime, but it is still a dereference according to the abstract machine and so is undefined if bar is NULL). I think the risk of something like that is why it's still disabled.
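A sketch of that pattern (hypothetical names, not the actual kernel code):

    #include <stddef.h>
    struct bar { int baz; };
    int f(struct bar *bar) {
        int *foo = &bar->baz;   /* no load: just bar plus an offset, so it never traps at runtime */
        if (bar == NULL)        /* ...yet it's UB if bar is NULL, so this check may be deleted */
            return -1;
        return *foo;
    }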
Irrelevant, because delete-null-pointer-checks happens even in absence of nonnull function attribute, see GP's godbolt link, and the documentation that omits any reference to that function attribute.
That is a side effect of passing the pointer as a function parameter marked nonnull. It implies that the pointer is nonnull and any NULL checks against it can be removed. Pass it to a normal function and you will not see the NULL check removed.
Explanation for the above: passing NULL as the destination argument to memcpy() is undefined behaviour at present. gcc assumes that the fact that memcpy() is called therefore means that the destination argument can't be NULL, so "knows" that the dest == NULL check can never be true, and so removes the test and the do_thing1() branch entirely.
Interestingly, replacing len with a literal 0 in the memcpy() call results in gcc instead removing the memcpy() call and retaining the check - presumably a different optimisation routine decides that it's a no-op in that case. https://godbolt.org/z/cPdx6v13r is, therefore, interesting - despite this only ever calling test() with a len of 0, the elision of the dest == NULL check is still there, but test() has been inlined without the memcpy (because len == 0) but with do_thing2() (because the behaviour is undefined and so it can assume dest isn't NULL even though there's a NULL literally right there!)
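The code in question is roughly of this shape (a reconstruction based on the names used above; the exact godbolt snippet may differ):

    #include <string.h>
    void do_thing1(void);
    void do_thing2(void *);
    void test(void *dest, const void *src, size_t len) {
        memcpy(dest, src, len);   /* UB today if dest or src is NULL, even when len == 0 */
        if (dest == NULL) {       /* gcc assumes this can never be true... */
            do_thing1();          /* ...and drops this branch entirely */
            return;
        }
        do_thing2(dest);
    }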
You can, but gcc may replace it with an equivalent set of instructions as a compiler optimization, so you would have no guarantee it is used unless you hack the compiler.
On a related note, GCC optimizing away things is a problem for memset when zeroing buffers containing sensitive data, as GCC can often tell that the buffers are going to be freed and thus the write is deemed unnecessary. That is a security issue and has to be resolved by breaking the compiler’s optimization through a clever trick:
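One common form of the trick (a sketch; the trick referenced above may differ) is to route the call through a volatile function pointer, so the compiler cannot prove the memset is dead:

    #include <string.h>
    /* volatile: the compiler must re-read the pointer and cannot assume it still
       points at memset, so it cannot reason away the call made through it */
    static void *(*volatile memset_ptr)(void *, int, size_t) = memset;
    static void secure_zero(void *buf, size_t len) {
        memset_ptr(buf, 0, len);
    }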
Similarly, GCC may delete a memcpy to a buffer about to be freed, although I have never observed that as you generally don’t do that in production code.
> Similarly, GCC may delete a memcpy to a buffer about to be freed, although I have never observed that as you generally don’t do that in production code.
It's not that crazy. You could have a refcounted object that poisons itself when the refcount drops to zero, but doesn't immediately free itself because many malloc implementations can have bad lock contention on free(). So you poison the object to detect bugs, possibly only in certain configurations, and then queue the pointer for deferred freeing on a single thread at a better time.
(Ok, this doesn't quite do it: poisoning is much more likely to use memset than memcpy, but I assume gcc would optimize out a doomed memset too?)
Yes, it potentially could be optimised out, which is why platforms provide functions like SecureZeroMemory() for cases where you want to be sure that memory is zeroed out.
That would be why I introduced an explicit_memset() into the OpenZFS encryption module in the commit that I linked. It uses two different techniques to guard against the compiler deleting it.
The valid inputs to memcpy() are defined by the C specification, so the compiler is free to make assumptions about what valid inputs are even if the library implementation chooses to allow a broader range of inputs
Per ISO C, the identifiers declared or defined with external linkage by any C standard library header are considered reserved, so the moment you define your own memcpy, you're already in UB land.
Many standard C functions are treated as “magic” by compilers. malloc is treated as if it has no side effects (which of course it does: it changes allocator state, but those aren't effects we care about observing), so the optimiser can elide allocations; without that special treatment the call could never be elided, because malloc would look like it has observable side effects.
If I'm understanding the OP correctly, the C standard says so, i.e. the semantics of memcpy are defined by the standard and the standard says that it's UB to pass NULL.
Unlike all the more complicated languages the "freestanding" mode C doesn't even have a memcpy feature, so it may not define how one works - maybe you've decided to use the name "memcpy" for your function which generates a memorandum about large South American rodents, and "memo_capybara" was too much typing.
In something like C++ or Rust, even their bare metal "What do you mean Operating System?" modes quietly require memcpy and so on because we're not savages, clearly somebody should provide a way to copy bytes of memory, Rust is so civilised that even on bare metal (in Rust's "core" library) you get a working sort_unstable() for your arbitrary slice types!
Can this function be compiled to store x in a register? Can it be compiled to remove x entirely and return the constant 1? That relies on "knowing that undefined behavior cannot happen." This program will behave differently if we store x on the stack and then return it after we call havoc() than if we call havoc() and then return the constant 1, if havoc() just writes to out of bounds memory addresses or whatever.
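The function under discussion is roughly this shape (a reconstruction; havoc() stands in for an external call the optimizer cannot see into):

    void havoc(void);
    int f(void) {
        int x = 1;
        havoc();     /* might scribble over the stack via out-of-bounds writes */
        return x;    /* may the compiler just return the constant 1? */
    }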
In this case the undefined behavior just feels "more extreme" to most people, but it is remarkably hard for people to rigorously define the undefined behavior that should and should not be considered when making optimizations.
Yes it does. The optimizing this to return the constant 1 is not producing an equivalent program unless we make assumptions about the behavioral bounds of havoc().
What is the difference between "writing past the end of an array is UB" and "dereferencing a null pointer is UB" and "passing null as the destination argument to memcpy is UB"? The two programs I listed above are only observationally equivalent if writing past the end of valid allocations is UB.
A core problem with this discussion in almost all circumstances is that people have a vibe for which of these things it feels okay for a compiler to make logical deductions from and which it feels not okay but if you actually sit down and try to formalize this in a way that would be meaningful to compiler vendors, you can't.
This example is not "I know that UB doesn't happen, therefore ...", which is what the memcpy() case is.
It is "I don't care that UB might happen, I am going to act as if it didn't. If the UB then makes the program behave differently than without the UB, that's not my problem".
Which, incidentally, is one of the suggested/permitted responses to UB in the standard's text (in a part that was later made non-binding).
They're just acting as agents that derive the logical consequences of the code.
The fact that the given example code is "surprising" is analogous to this mathematical derivation:
    a = b
    a*a = b*a
    a*a - b*b = b*a - b*b
    (a - b)(a + b) = b(a - b)
    (a - b)(a + b)/(a - b) = b(a - b)/(a - b)
    ^ Divide by 0, undefined behavior!
Everything below is not necessarily true.
    a + b = b
    b + b = b
    2b = b
    2 = 1
    2 - 1 = 1 - 1
    1 = 0
The source of truth about what is/isn't allowed is the C standard, not your personal simplified model of it that may contain dangerous misconceptions. The fact that your mental model doesn't match the document is an education problem, not a problem with the compiler.
> The fact that your mental model doesn't match the document is an education problem, not a problem with the compiler.
Or it is a problem with the document, which is the entire reason we are having this discussion: N3322 argued the document should be fixed, and now it will be for C2y.
I just skimmed through the proposed wording in [N3322]. It looks like it silently fixes a defect too, NULL == NULL was also undefined up until C23. Hilarious.
This is probably related to the issue with NULL - NULL mentioned in the article.
Imagine you’re working in real mode on x86, in the compact or large memory model[1]. This means that a data pointer is basically struct{uint16_t off,seg;} encoding linear address (seg<<4)+off. This makes it annoying to have individual allocations (“objects”) >64K in size (because of the weird carries), so these models don’t allow that. (The huge model does, and it’s significantly slower.) Thus you legitimately have sizeof(size_t) == 2 but sizeof(uintptr_t) == 4 (hi Rust), and God help you if you compare or subtract pointers not within the same allocation. [Also, sizeof(void *) == 4 but sizeof(void (*)(void)) == 2 in the compact model, and the other way around in the medium model.]
Note the addressing scheme is non-bijective. The C standard is generally careful not to require the implementation to canonicalize pointers: if, say, char a[16] happens to be immediately followed by int b[8], an independently declared variable, it may well be that &a+16 (legal “one past” pointer) is {16,1} but &b is {0,2}, which refers to the exact same byte, but the compiler doesn’t have to do anything special because dereferencing &a+16 is UB (duh) and comparing (char *)(&a+16) with (char *)&b or subtracting one from the other is also UB (pointers to different objects).
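A sketch of the encoding described above (illustrative only):

    #include <stdint.h>
    struct farptr { uint16_t off, seg; };        /* compact/large-model data pointer */
    uint32_t linear(struct farptr p) {
        return ((uint32_t)p.seg << 4) + p.off;   /* non-bijective: {off=16,seg=1} and
                                                    {off=0,seg=2} both name address 0x20 */
    }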
The issue with NULL == NULL and also with NULL - NULL is that now the null pointer is required to be canonical, or these expressions must canonicalize their operands. I don’t know why you’d ever make an implementation that has non-canonical NULLs, but I guess the text prior to this change allowed such.
> now the null pointer is required to be canonical
Yikes! This particular oddity seems annoying but sort of harmless in x86 real mode, but not necessarily in protected mode. Imagine code that wants to load a pointer into a register: it loads the offset into an ordinary register and the selector portion into a segment register. It’s permissible to load the 0 (null) selector, but loading garbage will fault immediately. So, if you allow non canonical NULL, then knowing that a pointer is either valid or NULL does not allow you to hoist a segment load above a condition that might mean you never actually dereference the pointer.
(I have plenty of experience with low-level OS code in all kinds of nasty x86 modes but, thankfully, not so much experience writing ordinary C code targeting protected mode. It sometimes boggles my mind that anyone ever got decent performance with anything involving far data pointers. Segment loads are slow, and there are not a lot of segment registers to go around.)
In real mode assembly days, ES and sometimes DS were just another base register that you could use in a loop. Given the dearth of addressing modes it was quite nice to assume that large arrays started at xxxx0h and therefore that the offset part of the far pointer was zero.
If so, it's one that's been introduced at some point post C99 -- the C99 spec explicitly defines the behaviour of NULL == NULL. Section 6.5.9 para 6 says "Two pointers compare equal if and only if both are null pointers, both are pointers to the same object [etc etc]".
Cannot find any confirmation of your statement. OTOH, "All null pointer values (of compatible type within the same address space) are already required to compare equal." appears in the linked paper.
"NULL" in fact is a macro, not a part of the language.
null (the zero pointer) is, and the standard explicitly defines that comparing two null pointers yields equality. Your example simply won't compile; it is not undefined, the pointers are simply of different types, period.
here is what the standard says:
"A pointer to void may be converted to or from a pointer to any object type.
Conversion of a null pointer to another pointer type yields a null pointer of that type. Any two null pointers shall compare equal."
therefore, convert either of them (or both) to void * and compare.
you'll get equality.
That's a reasonable intuitive interpretation of how it should behave, but according to the spec it's undefined behaviour and compilers have a great degree of freedom in what happens as a result.
More information on this behavior in the link below.
> Note that, apart from contrived examples with deleted null checks, the current rules do not actually help the compiler meaningfully optimize code. A memcpy implementation cannot rely on pointer validity to speculatively read because, even though memcpy(NULL, NULL, 0) is undefined, slices at the end of a buffer are fine. [And if the end of the buffer] were at the end of a page with nothing allocated afterwards, a speculative read from memcpy would break
> [And if the end of the buffer] were at the end of a page with nothing allocated afterwards, a speculative read from memcpy would break
‘Only’ on platforms that have memory protection hardware. Even there, the platform can always allocate an overflow page for a process, or have the page fault handler check whether the page fault happened due to a speculative read, and repair things (I think the latter is hugely, hugely, hugely impractical, but the standard cannot rule it out)
My comment is a reply to (part of) a comment that isn’t talking about reading from NULL. That’s what the [And if the end of the buffer] part implies.
Even if it didn’t, I don’t think the standard should assume that “Platforms without memory protection hardware also have no problem reading NULL”
An OS could, for example, have a very simple memory protection feature where the bottom half of the memory address range is reserved for the OS, the top half for user processes, and any read from an address with the high bit clear by code in the top half of the address range traps and makes the OS kill the process doing the read.
As a philosophical matter, by definition that would be memory protection hardware, sure. But the point is that it's at least conceivable that some platforms might have some crude, hardwired memory protection without having a full MMU.
Thanks for saving me a search, because I was expecting r0 to be hardcoded to zero.
Sometimes hardware is designed with insufficient input from software folks and the result is something asinine like that. That, or some people like watching the world burn.
What does "speculative" mean in this case? I understand it as CPU-level speculative execution a.k.a. branch mis-prediction, but that shouldn't have any real-world effects (or else we'd have segfaults all the time due to executing code that didn't really happen)
When C was conceived, CPU architectures and platforms were more varied than what we see today. In order to remain portable and yet performant, some details were left as either implementation defined, or completely undefined (i.e. the responsibility of the programmer). Seems archaic today, but it was necessary when C compilers had to be two-pass and run in mere kilobytes of RAM. Even warnings for risky and undefined behavior are a relatively modern concept (the last 10-20 years) compared to the age of C.
When C was conceived, it was made for a specific DEC CPU, for making an operating system. The idea of a C standard was in the future.
If you wanted to know what (for instance) memcpy actually did, you looked at the source code, or even more likely, the assembler or machine code output. That was "the standard".
I think it's reasonable to assume that GP clearly meant the C standard being conceived, as, obviously, K&R's C implementation of the language was ad hoc rather than exhibiting any prescribed specification.
> Seems archaic today ... run in mere kilobytes of RAM
There is an entire industry that does pretty much that... today. They might run in flash instead of RAM, but still, a few kilobytes.
Probably there are more embedded devices out there than PCs. PIC, AVR, MSP, ARM, custom archs. There might be one of those right now under your hand, in that thing you use to move the cursor.
1. Initially, they just wanted to give compiler makers more freedom: both in the sense "do whatever is simplest" and "do something platform-specific which dev wants".
2. Compiler devs found that they can use UB for optimization: e.g. if we assume that a branch with UB is unreachable we can generate more efficient code.
3. Sadly, compiler devs started to exploit every opportunity for optimization, e.g. removing code with a potential segfault.
I.e. the people who made the standard thought that the compiler would remove a no-op call to memcpy, but GCC removes the whole branch that makes the call, as it considers the whole branch impossible. The standard's authors thought that compiler devs would be more reasonable.
> Standard makers thought that compiler devs would be more reasonable
This is a bit of a terrible take? Compiler devs never did anything "unreasonable", they didn't sit down and go "mwahahaha we can exploit the heck out of UB to break everything!!!!"
Rather, repeatedly applying a series of targeted optimizations, each one in isolation being "reasonable", results in an eventual "unreasonable" total transformation. But this is more an emergent property of modern compilers having hundreds of optimization passes.
At the time the standards were created, the idea of compilers applying so many optimization passes was just not conceivable. Compilers struggled to just do basic compilation. The assumption was a near 1:1 mapping between code & assembly, and that just didn't age well at all.
One could argue that "optimizing based on signed overflow" was an unreasonable step to take, since any given platform will have some sane, consistent behavior when the underlying instructions cause an overflow. A developer using signed operations without poring over the standard might have easily expected incorrect values (or maybe a trap if the platform likes to use those), but not big changes in control flow. In my experience, signed overflow is generally the biggest cause of "they're putting UB in my reasonable C code!", followed by the rules against type punning, which are violated every day by ordinary usage of the POSIX socket functions.
> One could argue that "optimizing based on signed overflow" was an unreasonable step to take
That optimization allows using 64-bit registers / offset loads for signed ints which it can't do if it has to overflow, since that overflow must happen at 32-bits. That's not an uncommon thing.
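A sketch of the kind of loop this helps (my example, assuming a 64-bit target): because signed overflow is UB, the compiler may treat i as a 64-bit induction variable instead of redoing 32-bit wraparound and sign extension on every iteration.

    long sum(const long *a, int n) {
        long s = 0;
        for (int i = 0; i < n; i++)   /* i is assumed never to wrap, so it can be widened */
            s += a[i];
        return s;
    }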
I have started to like the signed overflow rules, because they make it really easy to find problems using sanitizers.
The strict aliasing rules are not violated by typical POSIX socket code, as a cast to a different pointer type (i.e. to `struct sockaddr *`) is by itself well-defined behavior. (And POSIX could of course just define something even if ISO C leaves it undefined, but I don't think that is needed here.)
> The strict aliasing rules are not violated by typical POSIX socket code as a cast to a different pointer type, i.e. `struct sockaddr` by itself is well-defined behavior.
Basically all usage of sendmsg() and recvmsg() with a static char[N] buffer is UB, is one big example I've run into. Unless you memcpy every value into and out of the buffer, which literally no one does. Also, reading sa_family from the output of accept() (or putting it into a struct sockaddr_storage and reading ss_family) is UB, unless you memcpy it out, which literally no one does.
Using a static char buffer would indeed be UB, but we just made the change in C2Y that this is ok (and in practice it always was). Incorrect use of sockaddr_storage may lead to UB. But again, most socket code I see is actually correct.
> Compiler devs never did anything "unreasonable", they didn't sit down and go "mwahahaha we can exploit the heck out of UB to break everything!!!!"
Many compiler devs are on record gleefully responding to bug reports with statements on the lines of "your code has undefined behaviour according to the standard, we can do what we like with it, if you don't like it write better code". Less so in recent years as they've realised this was a bad idea or at least a bad look, but in the '00s it was a normal part of the culture.
Can the compiler eliminate that nullptr comparison, in your opinion: yes or no? While this example looks stupid, after inlining it's quite plausible to end up with code in this type of a pattern. Dereferencing a nullptr is UB, and typically the "platform-specific" behavior is a crash, so... why should that if statement remain? And then, if it can't remain, why should an explicit `_Nonnull` assertion have different behavior than an explicit deref? What if the compiler can also independently prove that some_struct->blah() always evaluates to false, so it eliminates that entire branch - does the `if (bar == nullptr)` still need to remain in that specific case? If so, why? The code was the same in both cases; the compiler just got better at eliminating dead code.
There isn't a "find UB branches" pass that is seeking out this stuff.
Instead what happens is that you have something like a constant folding or value constraint pass that computes a set of possible values that a variable can hold at various program points by applying constraints of various options. Then you have a dead code elimination pass that identifies dead branches. This pass doesn't know why the "dest" variable can't hold the NULL value at the branch. It just knows that it can't, so it kills the branch.
Imagine the following code:
    int x = abs(get_int());
    if (x < 0) {
        // do stuff
    }
Can the compiler eliminate the branch? Of course. All that's happened here is that the constraint propagation feels "reasonable" to you in this case and "unreasonable" to you in the memcpy case.
Calling abs(INT_MIN) on a two's-complement machine is not allowed by the C standard. The behavior of abs() is undefined if the result would not fit in the return value.
Where does it say that? I thought this was a famous example from formal methods showing why something really simple could be wrong. It would be strange for the standard to say to ignore it. The behavior is also well defined in two’s complement. People just don’t like it.
> value constraint pass that computes a set of possible values that a variable can hold
Surely that value constraint pass must be using reasoning based on UB in order to remove NULL from the set of possible values?
Being able to disable all such reasoning, then comparing the generated code with and without it enabled would be an excellent way to find UB-related bugs.
There are many such constraints, and often ones that you want.
"These two pointers returned from subsequent calls to malloc cannot alias" is a value constraint that relies on UB. You are going to have a bad time if your compiler can't assume this to be true and comparing two compilations with and without this assumption won't be useful to you as a developer.
There are a handful of cases that people do seem to look at and say "this one smells funny to me", even if we cannot articulate some formal reason why it feels okay for the compiler to build logical conclusions from one assumption and not another. Eliminating null checks that are "dead" because they are dominated by some operation that is illegal if performed on null is the most widely expressed example. Eliminating signed integral bounds checks by assuming that arithmetic operations are non-overflowing is another. Some compilers support explicitly disabling some (but not all) optimizations derived from deductions from these assumptions.
But if you generalize this to all UB you probably won't end up with what you actually want.
Acceptable UB: do the exact same type of operation as for defined behavior, even if the result is determined by how the underlying hardware works.
NOT-acceptable UB: perform some operation other than what the valid code path would do - except for a failure to compile, or a warning message stating which code has been transformed into what other operation as a result of UB.
I don't understand, if the operation is not defined, what exactly the compiler should do?
If I tell you "open the door", that implies that the door is there. If the door is not there, how would you still open the door?
Concretely, what do you expect this to return:
    #include <cstddef>
    void sink(ptrdiff_t);
    ptrdiff_t source();
    int foo() {
        int x = 1;
        int y;
        sink(&y - &x);
        *(&y - source()) = 42;
        return x;
    }
assuming that source() returns the parameter passed to sink()?
Incidentally I had to launder the offset through sink/source, because GCC has a must-alias oracle to mitigate miscompiling some UB code, so in a way it already caters to you.
Offhand, regarding *sink(&y-&x);*: the compiler is not _required_ to lay out variables adjacently. So the computation of the pointers fed to sink does not have to be defined and might not be portable.
It would be permissible for the compiler to refuse to compile that ('line blah, op blah' does not conform the the standard's allowed range of behavior).
It would also be permissible to just allow that operation to happen. It's the difference of two pointer-sized units being passed. That's the operation the programmer wrote, that's the operation that will happen. Do not verify bounds or alter behavior because the compiler could calculate that the value happens to be PTRMAX-sizeof(int)+1 (it placed X and Y in reverse of how one might naively assume).
The = 42 line might write to any random address in memory. Again, just compile the code to perform the operation. If that happens to write 42 somewhere in the stack frame that leads to the program corrupting / a segfault that's fine. If the compiler says 'wait that's not a known memory location' or 'that's going to write onto the protected stack!' it can ALSO refuse to compile and say why that code is not valid.
I would expect valid results to be a return of: 42, 1 (possibly with a warning message about undefined operations and the affected lines), OR the program does not compile and there is an error message which says what's wrong.
&y-&x doesn't require the variables to be adjacent, just to exist in the same linear address space. It doesn't even imply any specific ordering.
> Again, just compile the code to perform the operation. If that happens to write 42 somewhere in the stack frame that leads to the program corrupting / a segfault that's fine. If the compiler says 'wait that's not a known memory location' or 'that's going to write onto the protected stack!
As far as the compiler is concerned, source() could return 0 and the line be perfectly defined, so there is no reason to produce an error. In fact, as far as the compiler is concerned, 0 is the only valid value that source() could return, so that line can only be writing to y. As that variable is a local that is going out of scope, the compiler omits the store. Or do you also believe that dead store elimination is wrong?
> possibly with a warning message about undefined operations and the affected lines
There is no definitely undefined operation in my example; there can be UB depending on the behaviour of externally compiled functions, but that's true of almost any C++ statement.
What most people in the "compiler must warn about UB" camp fail to realize is that 99.99% of the time the compiler has no way of realizing some code is likely to cause UB: from the compiler's point of view my example is perfectly standard compliant [1]; UB comes only from the behaviour of source and sink, which are not analysable by the compiler.
[1] technically to be fully conforming the code should cast the pointers to uintptr_t before doing the subtraction.
A charitable interpretation may be: back when the contract of this function was standardized, presumably in C89 some ~35 years ago, CPUs and C compilers alike were not as powerful, so wasting an extra couple of CPU cycles to check this condition was much more expensive than it is today. Because of that contract, as can be seen in the example in the comments below, the compiler is also free to eliminate the dead code, which also has the effect of shaving off some extra CPU cycles.
Probably because they did not think of this special case when writing the standard, or did not find it important enough to consider complicating the standard text for.
In C89, there's just a general provision for all standard library functions:
> Each of the following statements applies unless explicitly stated otherwise in the detailed descriptions that follow. If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer), the behavior is undefined. [...]
And then there isn't anything on `memcpy` that would explicitly state otherwise.
Later versions of the standard explicitly clarified that this requirement applies even to size 0, but at that point it was only a clarification of an existing requirement from the earlier standard.
People like to read a lot more intention into the standard than is reasonable. Lots of it is just historical accident, really.
Back when they wrote it they were trying to accommodate existing compilers, including those who did useful things to help people catch errors in their programs (e.g. making memcpy trap and send a signal if you called it with NULL). The current generation of compilers that use undefined behaviour as an excuse to do horrible things that screw over regular programmers but increase performance on microbenchmarks postdates the standard.
Because the benefit was probably seen as very little, and the cost significant.
When you're writing a compiler for an architecture where every byte counts you don't make it write extra code for little benefit.
Programmers were routinely counting bytes (both in code size and data) when writing Assembly code back then, and I mean that literally. Some of that carried into higher-level languages, and rightly so.
memcpy used to be a rep movsb on 8086 DOS compilers. I don't remember if rep movsb stops if cx=0 on entry, or decrements first and wraps around, copying 64K of data.
The specification does not explicitly say that, but the clear intention is that REP with CX=0 should be no-op (you get exactly that situation when REP gets interrupted during the last iteration, in that case CX is zero and IP points to the REP, not the following instruction).
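For the curious, a modern inline-asm sketch of the same idea (x86-64 GCC/Clang syntax, purely illustrative): with a count of zero, REP MOVSB simply does nothing.

    #include <stddef.h>
    static void rep_movsb_copy(void *dst, const void *src, size_t n) {
        __asm__ volatile("rep movsb"
                         : "+D"(dst), "+S"(src), "+c"(n)
                         :
                         : "memory");
    }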
I know at least MSVC's memcpy on x86_64 still results in a rep movsb if the cpuid flag that says rep movsb is fast is set, which it should be on all x86 chips from about 2011/2012 and onward ;)
Every time they leave something undefined, they do so to leave implementations free to use the underlying platform's default behavior, and to allow compilers to use it as an optimization point
> Every time they leave something undefined, they do so to leave implementations free to use the underlying platform's default behavior
That's implementation defined (more or less), i.e. the compiler can do whatever makes most sense for its implementation.
Undefined means (more or less) that the compiler can assume the behaviour never happens so can apply transforms without taking it into account.
> to allow compilers to use it as an optimization point
That's the main advantage of undefined behaviour ie if you can ignore the usage, you may be able to apply optimisations that you couldn't if you had to take it into account. In the article, for example, GCC eliminated what it considered dead code for a NULL check of a variable that couldn't be NULL according to the C spec.
That's also probably the most frustrating thing about optimisations based on undefined behaviour ie checks that prevent undefined behaviour are removed because the compiler thinks that the check can't ever succeed because, if it did, there must have been undefined behaviour. But the way the developer was ensuring defined behaviour was with the check!
AFAIK, something having undefined behavior in the spec does not prevent an implementation- (platform-)specific behavior being defined.
As to your point about checks being erased, that generally happens when the checks happen too late (according to the compiler), or in the wrong way. For example, checking that `src` is not NULL _after_ memcpy(dst, src, 0) is called. Or checking for overflow by doing `if(x+y<0) ...` when x and y are non-negative signed ints.
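Two sketches of checks that come "too late" or are done "in the wrong way" and may therefore be deleted (my examples):

    #include <stddef.h>
    #include <string.h>
    void f(char *dst, const char *src) {
        memcpy(dst, src, 0);          /* UB (today) if dst or src is NULL... */
        if (src == NULL)              /* ...so this check may be folded to false */
            return;
        /* ... */
    }
    int sum_ok(int x, int y) {        /* callers pass only non-negative x and y */
        if (x + y < 0)                /* signed overflow is UB, so "can't" go negative: */
            return -1;                /* the check may be removed */
        return x + y;
    }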
I mean, they might not have given thought to that particular corner case, they probably wrote something like
> memcpy(void* ptr1, void* ptr2, int n)
Copy n bytes from ptr1 to ptr2.
UNDEFINED if ptr1 is NULL or ptr2 is NULL
‐------
It might also have come from an "explicit better than implicit" opinion, as in "it is better to have developers explicitly handle cases where the null pointer is involved".
I think it's more a strategy. C was not created to be safe. It's pretty much a tiny wrapper around assembler. Every limitation requires extra cycles, compile time or runtime, both of which were scarce.
Of course, someone needs to check somewhere in the layers of abstraction: the user, programmer, compiler, CPU, architecture... They chose the programmer, who these days likes to call themselves an "engineer".
> C doesn't have any problems adding 4 to NULL nor subtracting NULL from NULL.
"Having problems" is not a fair description of what's at stake here. The C standard simply says that it doesn't guarantee that such operations give the results that you expect.
Also please note that the article and this whole thread is about the address zero, not about the number zero. If NULL is #defined as 0 in your implementation and you use it in an expression only involving integers, of course no UB is triggered.
I feel strongly that they should split undefined behavior into behavior that is not defined, and things that the compiler is allowed to assume. The former basically already exists as "implementation defined behavior". The latter should be written out explicitly in the documentation:
> memcpy(dest, src, count)
> Copies count bytes from src to dest. [...] Note this is not a plain function, but a special form that
applies the constraints dest != NULL and src != NULL to the surrounding scope. Equivalent to:
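To complete the sketch after "Equivalent to:" (my own illustration, using an ASSUME macro built on __builtin_unreachable(); the exact spelling would be up to the standard or the compiler):

    #include <stddef.h>
    #include <string.h>
    #define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
    static inline void *memcpy_documented(void *dest, const void *src, size_t count) {
        ASSUME(dest != NULL);   /* the constraints the documentation would spell out */
        ASSUME(src != NULL);
        return memcpy(dest, src, count);
    }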
The conflation of both concepts breaks the mental model of many programmers, especially ones who learned C/C++ in the 90s, when it was common to write very different code, with all kinds of now-illegal things like type punning and checking this != NULL.
I'd love to have a flag "-fno-surprizing-ub" or "-fhighlevel-assembler" combined with the above `assume` function or some other syntax to let me help the compiler, so that I can write C like in the 90s - close to metal but with less surprizes.
> I'd love to have a flag "-fno-surprizing-ub" or "-fhighlevel-assembler" combined with the above `assume` function or some other syntax to let me help the compiler, so that I can write C like in the 90s - close to metal but with less surprizes.
The problem, which you may realise with some more introspection is that "surprising" is actually a property of you, not of the compiler, so you're asking for mind-reading and that's not one of the options. You want not to experience surprise.
You can of course still get 1990s compilers and you're welcome to them. I cannot promise you won't still feel surprised despite your compiler nostalgia, but I can pretty much guarantee that the 1990s compiler results in slower and buggier software, so that's nice, remember only to charge 1990s rates for the work.
I get that for the library. But I'm a bit puzzled about the optimizations done by a compiler based on this behavior.
E.g., suppose we patch GCC to preserve any conditional containing the string 'NULL' in it. Would that have a measurable performance impact on Linux/Chromium/Firefox?
People will only rely on UB when it is well defined by a particular implementation, either explicitly or because of a long history of past use. E.g. using unions for type punning in gcc, or allowing methods to be called on null pointers in MSVC.
A trivial implementation wouldn't dereference dest or src in case the length is 0. That's how a student would write it with a for loop (byte-by-byte copy). A non-trivial implementation might do something with the pointers before entering the copy loop.
It does nothing, but is only defined when the pointers point into or one past the end of valid objects (live allocations), because that's how the standard defines the C VM, in terms of objects, not a flat byte array.
This is wrong. If you do p=malloc(256), p+256 is valid even though it does not point to a valid address (it might be in an unmapped page; check out ElectricFence). Rust's non-null, aligned but possibly dangling pointers are the same: memcpy can't assume it can dereference them if the size is zero. The standard text in the linked paper says the same.
also UB according to the spec, but LLVM is free to define it. e.g., clang often converts trivial C++ copy constructors to memcpy, which is UB for self-assignment, but I assume that's fine because the C++ front-end only targets LLVM, and LLVM presumably defines the behaviour to do what you'd expect.
Where I work, it is quite normal to link together C code compiled with GCC and Rust code compiled with LLVM, due to how the build system is set up.
As far as I know that disables LTO, but the build system is so complex, and the C code so large, that nobody bothers switching the C side to Clang/LLVM as well.
I have asked this question in the past and was told that memcpy() is allowed to preemptively read before it has determined it needs to write to make it faster on some CPUs. The presumption is that if you are going to be copying data, there is at least one cache line there already, so reading can start early.
Purely mechanically, yes, but in terms of the definition of the behaviour in the C abstract machine, no, because certain operations on null pointers are undefined, even if the obvious low-level compilation turns into nothing.
If you do this, your C code will run significantly slower than, say, Java, Go, or C#, because the compiler is unable to apply even the most basic optimizations (which it can do still in all those other languages).
So, at that point why even use C at all? Today, C is used where the overhead of a managed language is unacceptable. If you could just eat the performance cost, you'd probably already be using a managed language. There's not much desire for a variant of C with what would be at least a 10x slowdown in many workloads.
Or it could be made faster because certain manual optimizations become possible.
An example would be a table of interned strings that you want to match against (say you're writing a parser).
Since standard C says thou shall not compare pointers with < or > unless they both point into the same 'object', you are forbidden from writing the speed-of-light code:
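The forbidden-but-fast check would look something like this (my sketch, with hypothetical names):

    #include <stdbool.h>
    extern char intern_pool[1 << 16];      /* hypothetical arena holding every interned string */
    extern char *intern_pool_end;
    static inline bool is_interned(const char *s) {
        /* UB if s points into a different object: relational comparison of
           pointers to different objects is undefined in ISO C */
        return s >= intern_pool && s < intern_pool_end;
    }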
To elaborate, we treat pointers as more than just integers because it gives optimizers the latitude to reorder and eliminate pointer operations. In the example above we cannot do this, because we cannot prove at compile time that x doesn't live at the address returned by oracle.
However, given how low-level a language C++ is, we can actually break this assumption by setting i to y-x. Since &x[i] is the same as x+i, this means we are actually writing 23 to &y[0].
But that is undefined; you can't do x + (y - x), i.e. pointer arithmetic that ends outside the bounds of an array. Since it is undefined, shouldn't C++ assume that changing x[..] can't change y[0]?
edit: welp, if I had read a few more lines into the article I would have seen that it also says it is undefined
to be clear, in my example the result of oracle() cannot possibly alias with 'x' in C or C++ (and in fact gcc will optimize accordingly). In a different language where addresses are mere integers, things would be more complicated.
The numerical value returned by oracle might physically match the address of the stack slot for 'x', assuming that it exists, but it doesn't mean that, from a language point of view, it is a valid pointer.
If forging pointers had defined behaviour, it would be impossible to use the language sanely or perform any kind of optimization.
That’s the point. C allows this function to be optimized to always return 1. A “pointers are addresses, just emit reads and writes and stop trying to be so clever” version of C would require x to be spilled to the stack, then the write, then reload x and return whatever it contained.
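A sketch of the kind of function being talked about (reconstructed; x is a local whose address never escapes, and oracle() is an opaque external function):

    extern int *oracle(void);
    int f(void) {
        int x = 1;
        *oracle() = 23;   /* provenance rules say this store cannot legally hit x... */
        return x;         /* ...so the compiler may compile this as "return 1" */
    }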
Then use the register keyword or just reword the standard to assume the register behavior if a variable's address hasn't been taken.
The majority of useful optimizations can be kept in a "Sane C" with either code style changes (cache stuff in local vars to avoid aliasing for example) or with minor tweaks to the standard.
Register behavior is what you want essentially all of the time. So we’d just have to write `register` all over the place for no gain.
“Don’t optimize this, read and write it even if you think it’s not necessary” is a very rare case so it shouldn’t be the default. If you want it, use the volatile keyword.
There’s no need to reword the standard to assume the register behavior if the variable’s address hasn’t been taken. That’s already how it works. In this example, if you escape the value of `&x`, it’s not legal to optimize this function to always return 1.
When using C, this can return anything (or crash if the oracle function returns an invalid pointer, or rewrite its own code if the code section is writable). So if you get rid of the "abstract machine", nothing changes - the program can return anything or crash.
The point is that the C standard does guarantee that the function returns 1 if the program is a valid C program - which means there is no UB.
For example: If the oracle function returns an invalid pointer, then dereferencing that pointer is UB, and therefore the program isn't a valid C program.
the literal 1 is not an object in C or C++ hence it does not have an address. If you meant 'x', then also no, oracle() can't return the address of 'x' because of pointer provenance rules.
That would restrict C to memory models with a linear address space. That is usually the case nowadays for C implementations, but maybe we don’t want to set that in stone, because it would be virtually impossible to revert such a guarantee.
There’s also cases like memory address ranges that map to non-memory hardware (i.e. that don’t behave like “dumb” memory), and how would you have the C standard define behavior for those?
Lastly, CPU caches require some sort of abstract model as soon as you have multi-threading.
The value of an abstract machine is that it allows you to specify how a given program behaves without needing to point to a specific piece of hardware. Compilers then have this as a target when compiling a program for a specific piece of hardware so that they know when the compiler's output is correct.
The issue here is that the abstract machine is under- or badly specified.
That’s currently the case in C, in that you can convert pointers to and from uintptr_t. However, not every number representable in that type needs to be valid memory (that’s true on the assembly level as well), hence it’s only defined for valid pointers.
> I think a memory address is a number that CPU considers to be a memory address
I meant to say that, indeed, there must be some concept of CPU for a memory address to have a meaning, and for this concept of CPU to be as widely applicable as possible, surely defining it as abstract as possible is the way to go. Ergo, the idea of a C abstract machine.
Anyway, other people in this thread are discussing the matter more accurately and in more details than I could hope to do, so I'll leave it like that.
Undefined behaviour is undefined behaviour whatever optimisation level you use.
Some -f flags may extend the C standard and remove undefined behaviour in some cases (e.g. strict aliasing, signed integer overflow, writable string constants, etc.)
20 years ago, making a C compiler that provided sane behaviour and better guarantees (going beyond the minimum defined in the standard) to make code safer and programmers' lives easier, even at the cost of some performance, might have been a good idea. Today any programmer who thinks things like not having security bugs are more important than having bigger numbers on microbenchmarks has already moved on from C.
This is certainly not true. Many programmers also learned to use the tools available to write reasonably safe code in C. I do not personally find this problematic.
> Many programmers also learned to the use tools available to write reasonably safe code in C.
And then someone compiled their code with a new compiler and got a security bug. This happens consistently. Every C programmer thinks their code is reasonably safe until someone finds a security bug in it. Many still think so afterwards.
There are a couple of cases where compiler optimizations caused security issues, but that this happens all the time is a huge exaggeration. And many of the practically relevant cases can be avoided by using tools such as UBSan. The actual practical issue in C is people getting their pointer arithmetic wrong, which can also be avoided by having safe abstractions for buffer and string handling.
The other fallacy is that these issues would then suddenly disappear when using Rust, which is also not the case, because the programmer cutting corners in C or prioritizing performance over safety will also use Rust "unsafe" carelessly.
Rust has a clear advantage for temporal memory safety. But it is also possible to have a clear strategy about what data structure owns what other object in C.
> And many of the practically relevant cases can be avoided by using tools such as UBSan.
"can be", but aren't.
> The other fallacy is that these issues would then suddenly disappear when using Rust, which is also not the case, because the programmer cutting corners in C or prioritizing performance over safety will also use Rust "unsafe" carelessly.
The vast majority of these programmers aren't making a deliberate choice at all though. They pick C because they heard it's fast, they write it in the way that the language nudges them towards, or the way that they see done in libraries and examples, and they end up with unsafe code. Sure, someone can deliberately choose unsafe in Rust, but defaults matter.
> it is also possible to have a clear strategy about what data structure owns what other object in C.
Is it though? How can one distinguish a codebase that does from a codebase that doesn't? Other than the expensive static analysis tool mentioned elsewhere in the thread (at which point you're not really writing "C"), I've never seen a way that worked and was distinguishable from the ways that don't work.
> > And many of the practically relevant cases can be avoided by using tools such as UBSan.
> "can be", but aren't.
It is a possible option when one needs improved safety, and IMHO often a better option than using Rust.
> > The other fallacy is that these issues would then suddenly disappear when using Rust, which is also not the case, because the programmer cutting corners in C or prioritizing performance over safety will also use Rust "unsafe" carelessly.
> The vast majority of these programmers aren't making a deliberate choice at all though. They pick C because they heard it's fast, they write it in the way that the language nudges them towards, or the way that they see done in libraries and examples, and they end up with unsafe code. Sure, someone can deliberately choose unsafe in Rust, but defaults matter.
The choice of handcoding some low-level string manipulation is similar to the choice of using unsafe Rust. One can do it or not. There is certainly a better security culture in Rust at this time, but it is unclear to what extent this will be true in the long run. Also, C security culture is improving too, and Rust culture will certainly deteriorate when usage spreads from highly motivated early adopters to the masses.
> > it is also possible to have a clear strategy about what data structure owns what other object in C.
> Is it though? How can one distinguish a codebase that does from a codebase that doesn't?
This leads to the argument that it is trivial to see unsafe code in Rust because it is marked "unsafe" and is just a small amount of code, while in C you would need to look at everything. But this is largely a theoretical argument: in practice you need to do some quality control for all code anyway, because memory safety is just a small piece of the overall puzzle. (And even for memory safety, you also need to look at the code surrounding the unsafe code in Rust.) In practice, it is not hard to recognize the C code which is dangerous: it is the code where pointer arithmetic and string manipulation are not encapsulated in safe interfaces, and the code where ownership of pointers is not clear.
> Other than the expensive static analysis tool mentioned elsewhere in the thread (at which point you're not really writing "C"), I've never seen a way that worked and was distinguishable from the ways that don't work.
I see some very high quality C code with barely any memory safety problems. Expensive static analysis can be used when no mistakes are acceptable, but then you should also formally verify the unsafe code in Rust.
> The choice of handcoding some low-level string manipulation is similar to the choice of using unsafe rust. One can do it or not.
But most of the time programmers don't make a conscious choice at all. So opt-out unsafety versus opt-in unsafety is a huge difference.
> In practice you need to do some quality control for all code anyway, because memory safety is just a small piece of overall the puzzle.
Memory safety is literally more than half of real-world security issues.
> In practice, it is not hard to recognize the C code which is dangerous
> I see some very high quality C code with barely any memory safety problems
I hear a lot of C people saying this sort of thing, but they never make it concrete - there's no list of which popular open-source libraries are dangerous and which are not, it's only after a vulnerability is discovered that we hear "oh, that project always had poor quality code". If I pick a random library to maybe use in my project (even big-name ones e.g. libpq or libtiff), no-one can ever actually answer whether that's high quality C code or low quality C code, or give me a simple algorithm that I can actually apply without having to read a load of code and make a subjective judgement. Whereas I don't have to read or judge anything or even properly know rust to do "how much of this rust code is unsafe".
But I think even this is likely overstating it by looking at CVEs and not real world impact.
> > > In practice, it is not hard to recognize the C code which is dangerous
> > I see some very high quality C code with barely any memory safety problems
> I hear a lot of C people saying this sort of thing, but they never make it concrete - there's no list of which popular open-source libraries are dangerous and which are not, it's only after a vulnerability is discovered that we hear "oh, that project always had poor quality code". If I pick a random library to maybe use in my project (even big-name ones e.g. libpq or libtiff), no-one can ever actually answer whether that's high quality C code or low quality C code, or give me a simple algorithm that I can actually apply without having to read a load of code and make a subjective judgement. Whereas I don't have to read or judge anything or even properly know rust to do "how much of this rust code is unsafe".
So you look at all the 300 unmaintained dependencies a typical Rust project pulls in via cargo and look at all the "unsafe" blocks to screen it? Seriously, the issue is lack of open-source manpower, and this will hit Rust very hard once the ecosystem gets larger and goes even more beyond the highly motivated early adopters. I would be more tempted to buy this argument if Rust had no "unsafe" and I could pull in arbitrary code from anywhere and be safe. And this idea existed before with managed languages... safe Java in the browser and so on. That also sounded plausible, but was similarly highly exaggerated, just like the Rust story.
> A programmer being careless will be careless with Rust "unsafe" too.
Programmers will be careless, sure, but you can't really use unsafe without going out of your way to. Like, no-one is going to write "unsafe { *arr.get_unchecked(index) }" instead of "arr[index]" when they're not thinking about it.
> So you look at all the 300 unmaintained dependencies a typical Rust projects pulls in via cargo and look at all the "unsafe" blocks to screen it?
No, of course not, I run "cargo geiger" and let the computer do it.
I think unmaintained dependencies are less likely, and easier to check, in the Rust world. Ultimately what defines the attack surface is the number of lines of code, not how they're packaged, and C's approach tends to lead to linking in giant do-everything frameworks (e.g. people will link to GLib or APR when they just wanted some string manipulation functions or a hash table, which means you then have to audit the whole framework to audit that program's dependencies. And while the framework might look well-maintained, that doesn't mean that the part your program is using is), reimplementing or copy-pasting common functions because they're not worth adding a dependency for (which is higher risk, and means that well-known bugs can keep reappearing, because there's no central place to fix it once and for all), or both. And C's limited dependency management means that people often resort to vendoring, so even if your dependency is being maintained, those bugfixes may not be making their way into your program.
> And this idea existed before with managed languages... Safe Java in the browser and so on. It also sounded plausible, but was just as exaggerated as the Rust story is.
Java has quietly worked. It didn't succeed in the browser or on the open-source or consumer-facing desktop for reasons that had nothing to do with safety (in some cases they had to do with the perception of safety), but backend processing or corporate internal apps are a lot safer than they used to be, without really having to change much.
You're like a Japanese holdout in the 60s refusing to leave his bunker long after the war is over.
C lost. Memory safety is a huge boon for security. Human beings, even the best of them, cannot consistently write correct C code. (Look at OpenBSD.) You can keep fighting the war your side has already lost or you can move on.
I think the first one, stack overflow, is technically not a memory safety issue, just denial-of-service on resource exhaustion. Stack overflow is well defined as far as I know.
The other three are definitely memory safety issues.
I would consider a stack overflow to be a memory safety issue. The C++ language authors likely would too. C++ famously refused to support variable-length stack-allocated arrays because of memory safety concerns. Specifically, they were worried that code at runtime would make an array so big that it would jump the OS guard page, allowing access to unallocated memory, which of course is not noticed ahead of time during development. This is easy to do unintentionally if you have more stack variables after an enormous stack-allocated array and touch them before you touch the array (see the sketch below). The alternative is to force developers to use compiler extensions such as alloca(). That makes it easy to pass pointers outside of the stack frame where they are valid, which is a definite safety issue. The C++ nitpicking over variable-length arrays is silly, since it gives us a status quo where C++ developers use alloca() anyway, but it shows that stack overflows are considered a memory safety issue.
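For concreteness, a minimal hypothetical sketch of that guard-page concern (my own illustration; whether anything actually faults depends on the platform, the frame layout, and whether stack probing is enabled):

#include <stddef.h>
#include <string.h>

void handle(size_t n, const char *src) {
    char big[n];    /* if n is, say, 8 MiB, the stack pointer jumps far past
                       the single OS guard page in one step */
    big[0] = 0;     /* this write can land in whatever mapping happens to sit
                       below the guard page: silent corruption instead of a
                       clean SIGSEGV */
    memcpy(big, src, n);
}

Modern compilers can mitigate this by probing the stack page by page as the frame grows (GCC and Clang expose this as -fstack-clash-protection), so the guard page can't be skipped.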
In the general case, I think you might be right, although it's a bit mitigated by the fact that Rust does not have support for variable length arrays, alloca, or anything that uses them, in the standard library. As you said though, it's certainly possible.
I was more referring to that specific linked advisory, which is unlikely to use either VLAs or alloca. In that case, where stack overflow would be caused by recursion, a guard frame will always be enough to catch it, and will result in a safe abort [0].
This may be true, but the minimum amount of unsafe code still seems not that small. Maybe I just had bad luck, but one of the first things I looked at more closely was an implementation of a matrix transpose in Rust (as an example of something relevant to my problem domain), and it used unsafe Rust directly to be reasonably fast, and it already had a CVE. This was a revealing experience, because it was just the same type of bug you might have had in similar C code, but in a language where countless people insist that this "can not happen".
I agree that one shouldn't have been included. My favorite ones aren't included here anyway, e.g. how a Rust programmer managed to create a safety issue in a matrix transpose, or how they messed up str::repeat in their standard library.
And don't get me wrong, I think Rust is a safer language than C. Just the idea that C is completely unsafe and that it is impossible even for experts to write reasonably safe code, while it is completely impossible to create an issue in Rust, is nonsense. In reality, it is possible to screw up in both languages and people do, and safety in Rust is only somewhat better than C with good security practices. But this is not how it is presented. I also think the difference will become even smaller as C safety continues to improve, as it has in the last years due to better tooling, while Rust is picked up by average programmers under time pressure who will use "unsafe" just as carelessly as they hand-roll pointer arithmetic in C today.
Note that the key word here is sound. The more common static analyzers are unsound tools that will miss cases. Sound tools do not, but few people know of them, they are rare and they are typically proprietary and expensive.
Sure. I'm also a big fan of what Microsoft has done with SAL. And of course you have formally proven C, as used in seL4. I'd say that the contortions you have to go through to write code with these systems takes you out of the domain of "C" and into a domain of a different, safer language merely resembling C. Such a language might be a fine tool! But it's not arbitrary C.
Note that my original comment above was "reasonably safe" and not "perfectly memory safe". You can formally prove something with a lot of effort, but you can also come reasonably close for practical purposes with a lot less effort and more commonly available tools.
You are right that "arbitrary C" is not safe while safe Rust is safe, but this is mostly begging the question. The question is what you can do with the language given your requirements. If you need safe C, this is doable with enough effort; if you need reasonably safe C, this is practical in most projects; and all this should be compared to Rust as used in a similar situation, which may very well include use of unsafe Rust or of C libraries, which also limits the safety.
It is C. It is just written with the aid of formal methods. It would be nice if all software were written that way. That said, if you want another language “resembling C”, there is always WUFFS:
> It is C. It is just written with the aid of formal methods.
It is not C in the sense that many of the usual reasons to use C no longer apply. E.g. a common reason to use C is the availability of libraries, but most popular libraries will not pass that analyser so you can't use them if you're depending on that analyser. E.g. a common reason to use C is standard tooling for e.g. automated refactoring, but will those standard tools preserve analyser-passing? Probably not.
As I understand it, that doesn't imply it's not undefined to pass NULL pointers. While it's not what most users expect or want, it's possible that this is just a wrapper around a memcpy() which is only correct to call with valid destination and source pointers, even if the length is zero.
No, because ISO never said it must behave this way.
Yes, because every libc I've personally encountered acts this way. At a glance, glibc's x86 implementation[1, 2], musl, and picolibc all handle 0-length memcpy as you'd expect.
I'm sure other folks could dig up the code for Newlib, uclibc, and others, and they'd see the same thing.
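For intuition, a deliberately naive sketch (not any particular libc's actual code) shows why the zero-length case falls out naturally:

#include <stddef.h>

void *my_memcpy(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--) {        /* with n == 0 the body never runs, so the pointers
                            are never dereferenced */
        *d++ = *s++;
    }
    return dst;
}

Real implementations are far more elaborate (vectorized, alignment-aware), but the ones mentioned above still dispatch on the length before touching memory.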
On a related note, ISO C has THREE different things that most people tend to lump together as "undefined behavior." They are:
Implementation-defined behavior: ISO doesn't require any particular behavior, but they do require implementations to consistently apply a particular behavior, and document that behavior.
Unspecified behavior: ISO allows two or more possible behaviors and imposes no further requirement on which one is chosen; the choice doesn't have to be documented, and doesn't even have to be made consistently (argument evaluation order is the classic example).
Undefined behavior: ISO doesn't require any particular behavior, and they don't require implementations to define any particular behavior either.
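A few stock illustrations (my own examples; the classifications follow the ISO C standard):

#include <limits.h>

int f(void);
int g(void);

int examples(int a) {
    int impl_def    = -1 >> 1;   /* implementation-defined: right shift of a
                                    negative value */
    int unspecified = f() + g(); /* unspecified: whether f() or g() is
                                    evaluated first */
    int undefined   = a + 1;     /* undefined behavior if a == INT_MAX
                                    (signed overflow) */
    return impl_def + unspecified + undefined;
}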
> However, the most vocal opposition came from a static analysis perspective: Making null pointers well-defined for zero length means that static analyzers can no longer unconditionally report NULL being passed to functions like memcpy—they also need to take the length into account now.
How does this make any sense? We don't want to remove a low hanging footgun because static analyzers can no longer detect it?
No, it means the static analyzers can't report on a different error because a subset of that class of errors is no longer an error, and the static analysis can't usually distinguish between that subset and the rest.
memcpy(NULL, NULL, 0); // Formerly bad, now ok.
memcpy(NULL, NULL, s); // Formerly bad, now unknown (unless it can be proven that s != 0).
and
memcpy(NULL, b, c); // Same issue.
(where "NULL" == "statically known to be NULL", not necessarily just a literal NULL. Not that that changes the difficulty here.)
Previously: warn if either address might be NULL.
Now: warn if either address might be NULL and the length might be nonzero, and prepare for your users to be annoyed and shut this warning off due to the false alarms.
Any useful static analysis tool does a careful balance between false positives and false negatives (aka false alarms and missed bugs). Too many false positives, and that warning will be disabled, or users will get used to ignoring it, or it will be routinely annotated away at call sites without anyone bothering to figure out whether it's valid or not. Soon the tool will cease to be useful and may be entirely abandoned. In actual practice, the sophistication of a static analysis tool is far less relevant than its precision. It's quite common to have an incredibly powerful static analysis tool that is used for only a small handful of blazingly obvious warnings, sometimes ones that the compiler already has implemented! (All the tool's fancy warnings got disabled one by one and nobody noticed.)
Yes, but that tradeoff exists for most things those tools do. If you can easily and perfectly detect an error, it should just go into the compiler (and perhaps language spec).
> If you can easily and perfectly detect an error, it should just go into the compiler (and perhaps language spec).
Nobody seems to care much about removing UB even when it's super easy. For example, a bunch of basic syntax errors like forgetting the closing quote on a string or not having a newline at the end of the file are UB.
Isn't it more sensible to just check that the params that are about to be sent to memcpy are reasonable?
That is why I tend to wrap my system calls with my own internal function (which can be inlined in certain PLs), where I can standardize such tests. Otherwise, the resulting code that performs the checks and does the requisite error handling is bloated.
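A minimal sketch of such a wrapper might look like this (hypothetical and simplified, not anyone's actual code):

#include <stddef.h>
#include <string.h>

/* Returns 0 on success, -1 if the arguments would be invalid to pass on. */
static inline int checked_memcpy(void *dst, const void *src, size_t n) {
    if (n == 0) {
        return 0;                 /* nothing to copy; tolerate NULL here */
    }
    if (dst == NULL || src == NULL) {
        return -1;                /* let the caller decide how to react */
    }
    memcpy(dst, src, n);
    return 0;
}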
Note that I am also loath to #define such code, because C is already rife with them and my perspective is that the fewer of them the better.
At the end of the day, quick and dirty fixes will prove the adage "short cuts make long delays", and OpenBSD's approach is the only really viable long-term solution, where you just have to rewrite your code if it has ill-advised constructs.
For designing libraries such as C's stdlib, I don't believe in 'undefined behavior': clearly define your semantics and say, "If you pass a NULL to memcpy, this is what will happen." Same for what happens when n == 0, or when src == dst.
And if, for some strange reason, fixing the semantics breaks calling code, then I can't imagine that their code wasn't f_cked in the first place.
As the article points out, all major memcpy implementations already do this check inside memcpy. Sure, the caller can also check, but given that it's both redundant in practice and makes some common patterns harder to use than they would otherwise be, there's no reason to not just standardize what's already happening anyway and make everyone's lives easier in the process.
Why? It's 2024. Make it not be? Sure, some older stuff already written might no longer compile and need to be updated. Put it behind a "newer" standard flag/version or whatever.
Or is it that it can't be caught at compile time and only run time... hmm...
Generally, undefined behavior removes the need for systematically checking for special cases, the most common being out of bounds access.
But it can go further than that. Dereferencing a NULL pointer is undefined behavior, so if a pointer is dereferenced, it can be assumed by the compiler not to be NULL and the code can be optimized. For example:
void foo(int *p) {
    (void)*p;    /* dereference p before the NULL check */
    if (p == NULL) {
        printf("val is NULL\n");
    } else {
        printf("val is %d\n", *p);
    }
}
can be optimized to:
void foo(int *p) {
    (void)*p;
    printf("val is %d\n", *p);
}
Note that static analyzers will most likely issue a warning here as such a trivial case is most likely a mistake. But the check for NULL may be part of an inline function that is used in many places, and thanks to the undefined behavior, the code that handles the NULL case will only be generated when relevant. The problem, of course, is that it assumes that the programmer knows what he is doing and doesn't make mistakes.
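For example (a hedged sketch with made-up names), a defensive check inside an inline helper can vanish after inlining into a caller that already dereferenced the pointer:

#include <stdio.h>

static inline void print_val(const int *p) {
    if (p == NULL) {              /* generic helper keeps a defensive check */
        puts("val is NULL");
    } else {
        printf("val is %d\n", *p);
    }
}

void use(int *p) {
    int first = *p;               /* p already dereferenced here... */
    printf("first is %d\n", first);
    print_val(p);                 /* ...so after inlining, the compiler may
                                     drop the p == NULL branch entirely */
}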
In the case of memcpy(NULL, NULL, 0), there probably isn't much to gain from making it undefined. It most likely doesn't help the memcpy implementation (len=0 is generally a no-op), and inference based on the fact that the arguments can't be NULL is more likely to screw the programmer up than to improve performance.
It all adds up. All those instructions you don't have to execute, especially memory access and cache misses from jumps, pipeline stalls from conditionals, not just from this optimization.
>All of these 'problems' have simple and straightforward workarounds, I'm not convinced these UB are needed at all.
He gave you a simple and straightforward example, but that example may not be representative of a real world program where complex analysis leads to better performing code.
As a programmer, it's far easier to just insert bounds checks everywhere and trust the system to remove them when possible. This is what Rust does, and it is safe. The problem isn't the compiler, the problem is the standard. More broadly, the standard wasn't written with optimizing compilers in mind.
If we're inlining the call, then we can hoist the NULL check out of the loop. Now it's 1 check per 20 million operations. There's no need to eliminate it or have UB at that point.
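A rough sketch of that in C (illustrative only): whether the programmer writes it or the compiler hoists it, the check runs once per call instead of once per element:

#include <stddef.h>

long sum(const int *data, size_t n) {
    long total = 0;
    if (data == NULL) {          /* one check, outside the hot loop */
        return 0;
    }
    for (size_t i = 0; i < n; i++) {
        total += data[i];        /* no per-iteration NULL check needed */
    }
    return total;
}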
The simplest example of a compiler optimization enabled by UB would be the following:
int my_function() {
    int x = 1;
    another_function();
    return x;
}
The compiler can optimize that to:
int my_function() {
    another_function();
    return 1;
}
Because it's UB for another_function() to use an out-of-bounds pointer to access the stack of my_function() and modify the value of x.
And the most important example of a compiler optimization enabled by UB is related to that: being UB to access local variables through out-of-bounds pointers allows the compiler to place them in registers, instead of being forced to go through the stack for every operation.
I don't find those to be compelling reasons; to the contrary, I think that kind of semantic circumvention is a symptom of a poorly developed industry.
How can we have properly functioning programs without clearly-defined, and sensible, semantics?
If the developer needs to use registers, then they should choose a dev env/PL that provides them, otherwise such kludges will crash and burn, IMO.
Are you saying that C compilers should change every local variable access to read and write to the stack just in case some function intentionally does weird pointer arithmetic to change their values without referring to them in the source code?
We stopped explicitly declaring locals with the 'register' keyword circa 40 years ago. Register allocation is a low hanging fruit and one of those things that is definitely best left to a compiler for most code.
And now they have to manage register pressure for it to keep being faster. And false dependencies. And some more. It doesn’t work like that. Developers can’t optimize like compilers do, not with modern CPUs. The compilers do the very heavy lifting in exchange for the complexity of a set of constraints they (and you as a consequence, must) rely on. The more relaxed these constraints are, the less performant code you get. Modern CPUs run modern interpreters as fast as dumbest-compiled C code basically, so if you want sensible semantics, then Typescript is one of the absolutely non-ironic answers.
What you describe there is UB. If you define this in the standard, you are defining a kind of runtime behavior that can never happen in a well formed program and the compiler does not have to make a program that encounters this behavior do anything in particular.
Does this still matter today? I mean, first registers are anyway saved on the stack when calling a function, and caches of modern processors are really nearly as fast (if not as fast!) as a register. Registers these days are merely labels, since internally the processor (at least for x86) executes the code in a sort of VM.
To me it seems that all these optimizations were really something useful back in the day, but nowadays we can as well just ignore them and let the processor figure it out without that much loss of performance.
Assuming that the program is "bug free" to me is a terrible idea, since even mitigations that the programmer puts in place to mitigate the effect of bugs (and no program is bug free) are skipped because the compiler can assume the program has no bug. To me security is more important than a 1% more boost in performance.
Register allocation is one of the most basic optimizations that a compiler can do. Some modern cpus can alias stack memory with internal registers, but it is still not as fast as not spilling at all.
You can enjoy -O0 today and the compiler will happily allocate stack slots for all your variables and keep them up to date (which is useful for debugging). But the difference between -O0 and -O3 is orders of magnitude on many programs.
> I mean, first registers are anyway saved on the stack when calling a function
No, they aren't. For registers defined in the calling convention as "callee-saved", they don't have to be saved on the stack before calling a function (and the called function only has to save them if it actually uses that register). And for registers defined as "caller-saved", they only have to be saved if their value needs to be kept. The compiler knows all that, and tends to use caller-saved registers as scratch space (which doesn't have to be preserved), and callee-saved registers for longer-lived values.
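A small illustration, assuming the x86-64 System V calling convention (rbx, rbp, r12-r15 callee-saved; rax, rcx, rdx, rsi, rdi, r8-r11 caller-saved); what a particular compiler actually emits will vary:

long helper(long x);

long caller(long a, long b) {
    long live = a * 7;    /* needed after the call: typically kept in a
                             callee-saved register (or spilled) across
                             helper() */
    long dead = b + 1;    /* only needed as the call argument: fine to leave
                             in a caller-saved scratch register */
    return helper(dead) + live;
}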
> and caches of modern processors are really nearly as fast (if not as fast!) as a register.
No, they aren't. For instance, a quick web search tells me that the L1D cache for a modern AMD CPU has at least 4 cycles of latency. Which means: even if the value you want to read is already in the L1 cache, the processor has to wait 4 cycles before it has that value.
> Registers these days are merely labels, since internally the processor (at least for x86) executes the code in a sort of VM.
No, they aren't. The register file still exists, even though register renaming means which physical register corresponds to a logical register can change. And there's no VM, most common instructions are decoded directly (without going through microcode) into a single µOp or pair of µOps which is executed directly.
> To me it seems that all these optimizations were really something useful back in the day, but nowadays we can as well just ignore them and let the processor figure it out without that much loss of performance.
It's the opposite: these optimizations are more important nowadays, since memory speeds have not kept up with processor speeds, and power consumption became more relevant.
> To me security is more important than a 1% more boost in performance.
Newer programming languages agree with you, and do things like checking array bounds on every access; they rely on compiler optimizations so that the loss of performance is only that "1%".
Many calling conventions use registers. And no, loads and stores are extremely complex and not free at all: fewer of them can issue each cycle, and a lot of expensive hardware goes into maintaining their ordering during execution.
In a real world program, removing all UB is in some cases impossible without adding new breaking features to the C language. But taking a real world program and removing all UB which IS possible to remove will introduce an overhead. In some programs this overhead is irrelevant. In others, it is probably the reason why C was picked.
If you want speed without overhead, you need to have more statically checked guarantees. This is what languages such as Rust attempt to achieve (quite successfully).
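To make the overhead concrete, here is a hedged sketch (my example) of the kind of check that defining away out-of-bounds UB implies on every access the compiler can't prove safe:

#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

int checked_get(const int *arr, size_t len, size_t i) {
    if (i >= len) {              /* the overhead: a branch on every access */
        fprintf(stderr, "index %zu out of bounds (len %zu)\n", i, len);
        abort();                 /* defined failure instead of UB */
    }
    return arr[i];
}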
What Rust attempts to achieve is to rule out accidentally introducing UB, by designing the language in a way that makes it impossible to have UB when sticking to the safe subset.
It is also possible to ensure that C programs have no UB, and this does not require any breaking changes to C. It usually requires some refactoring of the program.
A bold claim, I've written a whole lot of software in C, and most of it I'd be astonished if it truly has no UB. Even some of the relatively small, carefully written programs probably have edge case UB I never worried about when writing them.
It is certainly true that many C programs have edge cases which trigger UB. I also have written many such programs where I did not care. This does not contradict my statement though. There are programmers who meticulously care (and/or have to care) about getting the edge cases right and this is entirely possible.
I think I worded it poorly. In a real world program, a lot of optimizations rely on assumptions of not triggering UB.
Rephrased:
In a real world program removing all opportunities for UB is in some cases impossible without adding new breaking features to the C language.
This has nothing to do with whether you can or can't write a program without invoking UB. I am talking about a hypothetical large program which does not exhibit undefined behaviour but where if you modified it then you could trigger UB in many ways. The idea I am positing is that to make it such that you could not modify such a program in any way which could trigger UB, would be impossible without adding new breaking features to the C language (e.g. you would need to figure out some way of preventing pointers from being used outside of the lifetime of the object they point to).
But this does not need breaking features, it only needs 1) an opt-in safe mode, and 2) annotations to express additional invariants such as lifetimes. This would not break anything.
It doesn't break existing code, unless you want to statically guarantee that it does not trigger UB, in which case it does. The point is that if you need an opt-in safe mode or annotations to express additional invariants then you can't magically make existing code safe.
A lot of existing code is already safe. You can't prove all (or even most) existing code safe automatically. This is also true for Rust if you do not narrowly define safe as memory safe. You could transform a lot of C code to be memory safe by adding annotations and do some light refactoring and maybe pushing some residual pieces to "unsafe" blocks. This would be very similar to Rust.
Again, I am not trying to argue either way. The point I was making was about how you can't define away all UB in the C standard without needing to modify the language in a breaking way.
> You can't prove all (or even most) existing code safe automatically.
No but rust provides a proper type system which goes a long way to being able to prove and enforce a lot more about program behavior at compile time.
> You could transform a lot of C code to be memory safe by adding annotations and do some light refactoring and maybe pushing some residual pieces to "unsafe" blocks. This would be very similar to Rust.
It would only be somewhat similar to super basic entry level rust which ignores all the opportunities for type checking.
> Again, I am not trying to argue either way. The point I was making was about how you can't define away all UB in the C standard without needing to modify the language in a breaking way.
This depends on how you define "breaking". I think one can add annotations and transform a lot of C code to memory-safe C with slight refactoring, without introducing changes into the language that would break any existing code. You can not simply switch on a flag and make existing code safe ... except you can do this too ... it just then comes with a high run-time cost for checking.
> > > No but rust provides a proper type system which goes a long way to being able to prove and enforce a lot more about program behavior at compile time.
> > You could transform a lot of C code to be memory safe by adding annotations and do some light refactoring and maybe pushing some residual pieces to "unsafe" blocks. This would be very similar to Rust.
> It would only be somewhat similar to super basic entry level rust which ignores all the opportunities for type checking.
I do not believe you can solve a lot more issues with strong typing than you can already solve in C simply by building good abstractions.
> You can not simply switch on a flag and make existing code safe ... except you can do this too ... it just then comes with a high run-time cost for checking.
I don't think you can reasonably implement this even at a high runtime cost without breaking programs. Either way, you've managed to re-state the crux of my argument.
> I do not believe you can solve a lot more issues with strong typing than you can already solve in C simply by building good abstractions.
Then I don't think you have much familiarity with strong typing, or you are underestimating the performance impact of equivalently "safe" (in a broader sense than what rust uses the term for) abstractions in C.
The only way to get equivalent performance while maintaining the same level of guarantees in C is to generate C code, at which point you're definitely better off using another programming language.
https://godbolt.org/z/aPcr1bfPe