> nobody at Microsoft ever understood how the code worked (much less still understood it), and that most of the code was completely uncommented, we simply couldn't figure out why the collision detector was not working. Heck, we couldn't even find the collision detector!
This continues to be one of my pet peeves, particularly with code samples and a lot of what is posted on GitHub, even major libraries: almost no comments, background, or guidance on the intent and structure of the code.
No, I am not suggesting that something like this is necessary:
// Iterate through all elements
for(int i = 0; i < count; i++)
{
    // Check that velocity isn't above threshold
    if(velocity[i] > THRESHOLD)
    {
        // Limit velocity
        velocity[i] = THRESHOLD;
...
That's ridiculous. However, a comment that captures intent, rather than restating mechanics, is genuinely useful.
I've heard some say "I just write self-documenting code". That's a myth for all but the simplest of structures. Any non-trivial piece of work is far from self-documenting. Code is self-documenting for the guy who wrote it. I guarantee you that anyone else reading it has to reconstruct a stack in their head to understand what the hell is going on. That's not self-documentation. I shouldn't have to think to understand what a chunk-o-code is doing. The same goes for functions and methods.
The myth of self-documenting code is easy to demonstrate if I show you a piece of code in a language you don't know. I am going to assume that most programmers these days don't know assembler, Forth or Lisp.
I wrote the routine below twenty years ago. Strip the comments away and, even if you understand Lisp, you'd have to think it through step by step. With them, this is what I actually wrote:
;=====================================================================================================
; GetPolylineEndEntities
;
; Argument: Polyline entity name
; Return: Two-element list containing the first (if any) entity found at each end of the polyline.
; The polyline itself is excluded.
; If nothing is found at a particular end, that element in the list is set to nil.
;
(defun GetPolylineEndEntities ( plename / plends cvcenter cvsize oldcmdecho pt1 pt2 endss outlist)
  (setq plends (GetPolylineEnds plename)) ;Get the endpoints
  (setq cvcenter (getvar "viewctr")
        cvsize (getvar "viewsize")
        oldcmdecho (getvar "CMDECHO")
  )
  (setvar "CMDECHO" 0)
  (foreach point plends
    (progn
      ;Examine what connects at each end
      (setq pt1 (add2d point '(-0.0125 -0.0125)))
      (setq pt2 (add2d pt1 '(0.025 0.025)))
      ;Zoom to the end being analyzed to have better selection accuracy
      ; **** Have to figure out a way to do this without zooming ****
      (command "zoom" "c" point 2)
      ;Eliminate the original cable from the resulting selection set
      (setq endss (ssdel plename (ssget "C" pt2 pt1)))
      ;Add the first entity found to the output list
      (setq outlist (append outlist (list (ssname endss 0))))
    )
  )
  (command "zoom" "c" cvcenter cvsize)
  (setvar "CMDECHO" oldcmdecho)
  outlist
)
Even if you don't know Lisp you now have an idea of what this code is doing. My style has changed over the years. This isn't my best example, but it is here to drive a point home.
The use of an unfamiliar language serves to illustrate the point that the idea of self-documenting code is, again, a myth. I wrote that code myself and without the comments I'd have to mentally reconstruct every step to even begin to understand what's going on and what the intent was. I haven't touched Lisp in quite some time.
I've looked through so much code on GitHub without a single comment that it makes me wonder if this is what is being taught in schools these days. Accurate in-code documentation is, as far as I am concerned, part and parcel of becoming a professional programmer. It recognizes that the work represents a huge investment in time, money and intellectual effort, and it ensures that this effort and expense doesn't have to be duplicated in order to maintain, evolve or migrate the product, as the Microsoft example clearly demonstrates.
MSG_LOOP_PostMessage:
    mov a, MSG_Head
    cjne a, #MSG_BUFFER_END - 2, mlpm0
    mov a, #MSG_BUFFER
    sjmp mlpm1
mlpm0:
    inc a
    inc a
mlpm1:
    cjne a, MSG_Tail, mlpm2
    clr a
    ret
mlpm2:
    mov r0, MSG_Head
    mov @r0, MSG_Code
    inc r0
    mov @r0, MSG_Parameter
    mov MSG_Head, a
    mov a, #1
    ret
I wrote that about fifteen years ago. Of course, the routine's label tells you something: "MSG_LOOP_PostMessage". It must post a message to a message buffer. The rest is gibberish. What's the intent behind each and every block of code? Well, of course, that's not how I wrote it. This is what I wrote:
; POST MESSAGE --------------------------------------------------------------------------
; This routine posts a new message to the message loop buffer.
;
; The message code and parameter are written to the buffer only if buffer space is
; available. This is determined by first looking at the difference between the
; head and tail pointers. By design, we sacrifice one set of locations (two
; bytes) in the buffer in order to create a gap between an advancing head pointer and
; a lagging tail pointer. If a lot of messages are issued and not processed immediately
; the head pointer will quickly wrap around and threaten to collide with the tail
; pointer. By sacrificing a set of locations we avoid having to keep a counter of
; unprocessed messages. This, because there would be ambiguity when both head and
; tail pointers point to the same location: it could mean that there are no messages
; to process or that there's a buffer full of messages to process. The two byte
; gap we are imposing between head and tail removes this ambiguity and makes it easy
; to determine the buffer empty and full conditions.
;
; If there's space for a new message it is stored and the head pointer advanced to
; the next available location. However, if no space remains, the message is discarded
; and the head pointer will remain one message (two bytes) away from the tail pointer.
;
; Arguments
; ----------------------------
; MSG_Head Message buffer head pointer
; MSG_Tail Message buffer tail pointer
; MSG_Code Message code
; MSG_Parameter Message parameter
;
; Return
; ----------------------------
; ACC = 0 -> Message did not post
; ACC <> 0 -> Message posted
;
; Modified
; ----------------------------
; MSG_Head
; ACC, R0
;
; Preserved
; ----------------------------
; MSG_Tail
; MSG_Code
; MSG_Parameter
;
MSG_LOOP_PostMessage:
    mov a, MSG_Head                     ;Increment the head pointer by one message
    cjne a, #MSG_BUFFER_END - 2, mlpm0  ;and parameter.
    mov a, #MSG_BUFFER                  ;Need to wrap around.
    sjmp mlpm1                          ;
mlpm0:                                  ;
    inc a                               ;No need to wrap around, just increment.
    inc a                               ;
mlpm1:
    cjne a, MSG_Tail, mlpm2             ;Check for a buffer full condition.
    clr a                               ;Flag that we did not post the message.
    ret                                 ;Exit if it is.
mlpm2:
    ;The buffer isn't full, we can store the new message
    mov r0, MSG_Head                    ;Store message
    mov @r0, MSG_Code
    inc r0
    mov @r0, MSG_Parameter              ;Store parameter
    ;Now set the head pointer to the next available message location.
    mov MSG_Head, a                     ;The accumulator already has the next "head" location;
                                        ;there's no need to execute the same logic again.
    mov a, #1                           ;Flag that the message was posted
    ret                                 ;and exit
Now you don't have to know assembler to understand this code. A couple of years later I had to re-write this in C. It was incredibly easy to do because of the exhaustive in-code documentation. This is what came out:
// POST MESSAGE --------------------------------------------------------------------------
// This routine posts a new message to the message loop buffer.
//
// The message code and parameter are written to the buffer only if buffer space is
// available. This is determined by first looking at the difference between the
// head and tail pointers. By design, we sacrifice one set of locations (two
// bytes) in the buffer in order to create a gap between an advancing head pointer and
// a lagging tail pointer. If a lot of messages are issued and not processed immediately
// the head pointer will quickly wrap around and threaten to collide with the tail
// pointer. By sacrificing a set of locations we avoid having to keep a counter of
// unprocessed messages. This, because there would be ambiguity when both head and
// tail pointers point to the same location: it could mean that there are no messages
// to process or that there's a buffer full of messages to process. The two byte
// gap we are imposing between head and tail removes this ambiguity and makes it easy
// to determine the buffer empty and full conditions.
//
// If there's space for a new message it is stored and the head pointer advanced to
// the next available location. However, if no space remains, the message is discarded
// and the head pointer will remain one message (two bytes) away from the tail pointer.
//
//
// Return
// ----------------------------
// 0 = Message did not post
// 1 = Message posted
//
//
U8 msg_loop_post_message(U8 message, U16 parameter)
{
    if(msg_loop.count == MESSAGE_LOOP_SIZE)
    {
        return(0);                                          // Can't post because buffer is full
    }
    else
    {
        msg_loop.item[msg_loop.head].message = message;     // Post message
        msg_loop.item[msg_loop.head].parameter = parameter; //
        msg_loop.count++;                                   // Update message count
        if(msg_loop.head == (MESSAGE_LOOP_SIZE - 1))        // Wrap around?
        {
            msg_loop.head = 0;                              // Yes.
        }
        else
        {
            msg_loop.head++;                                // No
        }
        return(1);
    }
}
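Incidentally, the C port swapped the two-byte gap for an explicit count field. The gap scheme described in the assembler's comment block, expressed in C, would look roughly like this (a sketch; the struct and names are illustrative, not from the original project):

```c
#include <assert.h>

#define MESSAGE_LOOP_SIZE 8  /* hypothetical size; must be >= 2 */

struct msg_item
{
    unsigned char  message;
    unsigned short parameter;
};

struct msg_loop_t
{
    struct msg_item item[MESSAGE_LOOP_SIZE];
    int head;   /* next free slot */
    int tail;   /* next message to consume */
};

/* With the sacrificed-slot scheme there is no count field:
 *   empty <=> head == tail
 *   full  <=> advancing head would land on tail             */
static int msg_loop_post(struct msg_loop_t *q,
                         unsigned char message, unsigned short parameter)
{
    int next = (q->head + 1) % MESSAGE_LOOP_SIZE;
    if(next == q->tail)
    {
        return 0;            /* Buffer full: discard the message */
    }
    q->item[q->head].message = message;
    q->item[q->head].parameter = parameter;
    q->head = next;
    return 1;
}
```

One slot is never used; that's the C analogue of the two-byte gap, and it keeps the empty and full conditions distinguishable without a counter.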
This code is definitely readable, but I feel like the port from assembler to C might have been too direct. Originally head was a pointer, so conditional jumps made sense, but in the C code head is an offset starting at 0. Why keep the entire if structure when you can simply write
msg_loop.head = (msg_loop.head + 1) % MESSAGE_LOOP_SIZE; // Advance head and wrap
I remember I had to convert this to C in a hurry because new features had to be added. Doing it in assembler was just about unthinkable. The entire project consists of nearly 100 assembler files and resulted in about 90 files once translated into C.
This was a marathon run, and once the entire project was translated and working, the focus shifted narrowly to adding functionality rather than optimizing the code.
This code was running on an 8051-derivative. Performance was important. You are always counting clock cycles on these systems. I think that if you look at the resulting machine code with the relevant variables mapped to the data memory space the modulo statement doesn't really do any better than a spelled-out conditional statement. In fact, it could be a couple of clock cycles slower (just guessing).
Again, it's been a long time. I can totally see looking at the generated assembly and saying something like "That's as fast as it is going to get. Move on!".
We all make less-than-elegant code at one point or another. Hopefully not too many times. :-)
> the modulo statement doesn't really do any better than a spelled-out conditional statement
Actually, on modern processors the modulo would probably be more efficient than the conditional thanks to things like pipelining. Pipelining overlaps the execution of consecutive instructions (a sort of pseudo-parallelism), so when a branch comes along there's a fair chance that work already in flight has to be thrown out.
One of the techniques used to get around this is branch prediction[1], where the processor initially guesses whether a branch will be taken and then updates that guess based on whether it actually was. This works particularly well for loops and the like.
Another development on the ARM side (and MIPS too, I think) is conditionally executed (predicated) instructions, which avoid the branch altogether.[2]
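And on the modulo-versus-branch question: if the buffer size happens to be a power of two (an assumption on my part; the original project doesn't say), the wrap needs neither a branch nor a divide, just a single mask:

```c
#include <assert.h>

#define MESSAGE_LOOP_SIZE 8                    /* must be a power of two for this trick */
#define MESSAGE_LOOP_MASK (MESSAGE_LOOP_SIZE - 1)

/* Branch-free advance: compiles down to an increment and a bitwise AND,
 * which is cheap even on cores with no divider and no branch predictor. */
static unsigned advance(unsigned head)
{
    return (head + 1) & MESSAGE_LOOP_MASK;
}
```

That sidesteps both the conditional and the cost of a real modulo on small cores.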
If you throw methods and functions at everything, all you are doing is adding the processing overhead of entering and exiting them. These are not magical entities. There's a time, a place and a cost to using them.
Having come up from assembly and, in general, low level coding, one becomes very aware of what is being created behind the scenes. Unless something like this contrived bounds-check test will be used multiple times across a module or modules there's no reason whatsoever to add the overhead of entering and exiting a function for a simple couple of if/else-if/else statements.
I see this all the time. Everything has to be an object and everything has to be a class with a pile of properties and methods. No it doesn't. Massive projects --critical projects-- have been done over the years without any of that. Be careful not to engage in creating a monument to a coding paradigm rather than a tight, fast, practical and sensible solution to a problem.
Then tell your compiler to inline it - but don't bother doing so until you have measured an actual, measurable impact. Readability and rewritability dominate the kind of unguided micro-optimizations you seem to like.
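In C that can be as simple as a static inline helper (a sketch; clamp_velocity and THRESHOLD are my own stand-ins for the bounds check under discussion), at which point any modern compiler will typically eliminate the call overhead entirely:

```c
#include <assert.h>

#define THRESHOLD 100

/* Small enough that the compiler will almost certainly inline it,
 * giving the readability of a named function at zero call cost. */
static inline int clamp_velocity(int v)
{
    return (v > THRESHOLD) ? THRESHOLD : v;
}
```

You keep the name, the reuse and the testability, and the generated code is the same couple of instructions you'd have written by hand.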
That doesn't mean abstracting for the sake of abstraction, but choosing the right abstraction. Writing code so that it is easy to reach the right abstraction helps a lot.