
[deleted]


Just read the email.


Actually it's still a useful question, I think.

A simple example is function argument evaluation order. If you have:

foo(bar(), baz())

A compiler is free to call these functions in either order. That means if baz relies on state mutated by bar, the program may behave differently if the compiler chooses to reorder evaluation.
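A minimal sketch of that, in C. The names `foo`, `bar`, and `baz` come from the example above; the shared `trace` buffer and the `run_unsequenced`, `run_sequenced`, and `last_trace` helpers are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical shared state: each call appends its own tag. */
static char trace[8];

static int bar(void) { strcat(trace, "A"); return 1; }
static int baz(void) { strcat(trace, "B"); return 2; }
static int foo(int x, int y) { return x + y; }

/* The order in which the two arguments are evaluated is not fixed:
   afterwards trace holds either "AB" or "BA". */
int run_unsequenced(void) {
    trace[0] = '\0';
    return foo(bar(), baz());
}

/* Separate statements are sequenced, so trace is always "AB" here. */
int run_sequenced(void) {
    trace[0] = '\0';
    int x = bar();
    int y = baz();
    assert(strcmp(trace, "AB") == 0);
    return foo(x, y);
}

const char *last_trace(void) { return trace; }
```

Both functions return the same sum; only the contents of `trace` can differ, which is exactly the kind of difference that goes unnoticed until a compiler or flag changes.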


There's a parable where a man goes to the doctor and says, "Doctor, whenever I drink my coffee with the spoon in the cup, the spoon handle pokes me in the eye and it hurts." And the doctor says, "Well, stop doing that."

If you wrote `foo(bar(), baz())` and `baz()` relies on state mutated by `bar()`, your code is bad, and you should feel bad, because experiencing those bad feelings is the way you learn to not write bad code. This code was wrong before the compiler reordered the calls; it just failed silently for a while. The compiler isn't responsible for fixing your bugs; you are.

People need to stop expecting other people to fix their problems.


Uhh... The question was "what's an example of undefined behaviour". I gave one: specifically, an example of code that could realistically break by relying on undefined behaviour.

That's it. Simple education.

In the context of this post I think that's a good thing because not everyone will understand the topic.

You, however, seem to have read some sort of agenda into the question, which I find a little baffling...


You're right; I'm sorry for misreading your intention. I've edited my post so it's not as directed at you.


But that's not "undefined behavior" either. The post misleads people.

Certain things are unspecified. It can call in any order for example.

Other things are undefined. If you do them your program is no longer valid at all, and can crash or corrupt.
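To make the second category concrete: signed integer overflow is one commonly cited case of undefined behavior in C. A rough sketch of how code can stay on the defined side by checking before it adds; `checked_add` is a hypothetical helper name, not a standard function.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Signed overflow is undefined, so test for it before doing the addition.
   Returns false and leaves *out untouched if a + b would not fit in an int. */
bool checked_add(int a, int b, int *out) {
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

Contrast this with argument evaluation order: there the program stays valid whichever order is picked, whereas an unchecked `INT_MAX + 1` leaves the program with no guarantees at all.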


To reiterate/emphasize this point: "undefined", "unspecified", and a few other related terms are Things in C. They have specific, non-interchangeable, well-defined meanings in the C specifications.


The eye-spoon defence of C undefined and unspecified behaviours is very unfortunate and misleading: it makes it sound like avoiding them is as easy as just taking a spoon out of a cup.

Theoretically "stop doing that" works... but history has shown that it really doesn't work in practice, in C, in programming more broadly, and, really, in any human endeavour ("planes don't need safety procedures, just stop making mistakes").


Who is at fault in your example becomes much less clear cut when you consider the variant where the author of foo doesn't have access to the source code of bar and baz and they only rely on shared state on some systems or in some corner cases.


If bar() returned the mutated state, he could just do:

    foo(baz(bar()))
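A small sketch of why nesting helps, assuming the state can be passed by value. The `state` struct and `run_chained` are hypothetical; only the call shape comes from the line above.

```c
#include <assert.h>

/* Hypothetical state, threaded through the calls instead of being shared. */
typedef struct { int counter; } state;

state bar(void) { state s = { 1 }; return s; }       /* produces the state */
state baz(state s) { s.counter *= 10; return s; }    /* needs bar's result */
int foo(state s) { return s.counter + 1; }

int run_chained(void) {
    /* baz cannot be called until its argument exists, so bar runs first. */
    int result = foo(baz(bar()));
    assert(result == 11);
    return result;
}
```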


I think a better design would be to separate mutators from pure functions. If a procedure mutates state, it should have a void return type, and if a procedure returns a value it should be a pure function that doesn't mutate state.

This is, of course, a rule of thumb, not a hard law. Some exceptions:

1. I think it's okay (and in fact, idiomatic in C) to mutate state and return some sort of information about what occurred (e.g. a success flag, a number of characters written, etc.).

2. Isolated mutations such as logging sometimes make sense in an otherwise pure function.
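Roughly what that split might look like in C, including the first exception. The `buffer` type and its functions are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size text buffer. */
typedef struct {
    char data[16];
    size_t len;
} buffer;

/* Mutator: changes state, returns nothing. */
void buffer_clear(buffer *b) {
    assert(b != NULL);
    b->len = 0;
    b->data[0] = '\0';
}

/* Query: reads state, changes nothing. */
size_t buffer_remaining(const buffer *b) {
    return sizeof b->data - 1 - b->len;
}

/* Exception 1: mutates, and reports how many characters were written. */
size_t buffer_append(buffer *b, const char *text) {
    size_t n = strlen(text);
    size_t room = buffer_remaining(b);
    if (n > room) {
        n = room;  /* truncate rather than overrun the array */
    }
    memcpy(b->data + b->len, text, n);
    b->len += n;
    b->data[b->len] = '\0';
    return n;
}
```

Because `buffer_remaining` takes a `const` pointer and has no side effects, it is safe to call it inside an argument list regardless of evaluation order.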


I'm a fan of fluent design myself, and dislike flags.

    return foo().baz().bar();

With flags you start with success/fail, and end up with HRESULT.


I'd agree with you in some languages, but in C this would be prohibitively difficult. In general, good fluent design uses immutable objects, which makes it basically just a syntactic sugar for functional programming. While fluent syntax is nice, the functional semantics are the real value, and are much easier to do in C (although, as soon as you add in memory management, functional programming often becomes prohibitively difficult too).
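As a sketch of the "functional semantics without fluent syntax" idea in C, assuming small structs passed and returned by value so that memory management doesn't come into it. `vec2` and its functions are hypothetical.

```c
#include <assert.h>

typedef struct { int x, y; } vec2;

/* Each function takes values and returns a new value; nothing is mutated. */
vec2 vec2_add(vec2 a, vec2 b) { vec2 r = { a.x + b.x, a.y + b.y }; return r; }
vec2 vec2_scale(vec2 v, int k) { vec2 r = { v.x * k, v.y * k }; return r; }

vec2 example(void) {
    vec2 a = { 1, 2 };
    vec2 b = { 3, 4 };
    /* Reads inside-out rather than left-to-right, unlike a fluent chain. */
    vec2 r = vec2_scale(vec2_add(a, b), 2);
    assert(a.x == 1 && a.y == 2);  /* the inputs are unchanged */
    return r;
}
```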


A perhaps more interesting, quite realistic example:

    lock_two_widgets(widget *a, widget *b) {
        if (a < b) {
            lock(&a->lock);
            lock(&b->lock);
        } else {
            lock(&b->lock);
            lock(&a->lock);
        }
    }


If lock calls include a memory barrier, which they should, then they cannot be reordered.

Edit: Your code has undefined behavior, unless the two pointers point to the same object, which is unlikely in a realistic example.


Yes, the pointer comparison is UB, but if the compiler didn't go out of its way to screw you over for UB code, then it would be a reasonable way to prevent deadlock.


There is a defined method for comparing such pointers. Take unsigned char pointers to them and inspect their bytes. Then write your code, using that information, that does the comparison, assuming you know your architecture.

It is defined behavior to read the bytes of any object using unsigned char.


Alternatively, if the implementation supports uintptr_t, you can convert to that and then compare the respective integral values.
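A sketch of the `uintptr_t` route applied to the widget example. Assumptions, stated loudly: `uintptr_t` is an optional type, the integer you get from the conversion is implementation-defined (so the order need not match any notion of address order, it only has to be consistent), and `toy_lock` is a hypothetical single-threaded stand-in for a real mutex.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical widget with a toy lock flag instead of a real mutex. */
typedef struct { int locked; } widget;

static void toy_lock(widget *w) {
    assert(!w->locked);  /* the toy lock is not recursive */
    w->locked = 1;
}

/* Returns 1 if a should be locked before b. Comparing the converted
   integers is well defined even when a and b point to unrelated objects. */
int locks_before(const widget *a, const widget *b) {
    return (uintptr_t)(const void *)a < (uintptr_t)(const void *)b;
}

void lock_two_widgets(widget *a, widget *b) {
    if (a == b) {        /* same widget: take its lock only once */
        toy_lock(a);
        return;
    }
    if (locks_before(a, b)) {
        toy_lock(a);
        toy_lock(b);
    } else {
        toy_lock(b);
        toy_lock(a);
    }
}
```

The point of the ordering is that every caller agrees on which lock comes first, whichever way round the arguments are passed.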


Right, because you can get to the end of a non-void function without producing a value. The pointer comparison isn't necessarily UB.


That is called unspecified behavior not undefined behavior. Major difference.


Not many know (and fewer people care), funnily :)

I have yet to see a thorough (source-to-compiler-intent-to-assembly) comparative description of undefined, unspecified and implementation-defined behaviors, though (not just a somewhat insightful blog, or techno gospel, which the C and C++ standards are).


You could always write one. I recommend the work of John Regehr as a starting reference: http://blog.regehr.org/


I wish I had time to learn this first :)

But yes, authoring a book titled "Well-defined C (and C++)" one day would be awesome.

Thanks for the link.



