Hacker News new | past | comments | ask | show | jobs | submit login

Checking the length of an array on each iteration of a loop is an amateur mistake, in any language. Some of these points were pretty nitpicky, but I thought this was a valid complaint.



In compiled languages it's usually not worth breaking convention to factor it out, because the compiler will apply loop-invariant code motion and pull it out of the loop body anyway. The exception is if you're calling a function that the compiler can't prove to be referentially transparent, eg. strlen.


Really interesting, thanks. I don't spend a lot of time with compiled languages these days.

Question: how can the compiler be sure an array's length wouldn't have changed? I suppose it could examine the code within the loop, but then the same could be said for strlen. What's the difference?


Most languages don't have functionality to resize an array in place. If you realloc in C, you get back a pointer that may or may not be the same as what you passed in. If you 'new' in Java, you must assign that pointer to your initial variable. If the loop body never assigns to the array, the compiler can be sure that it still points to the same block of memory, and its length will be the same.

This is a specific example of dataflow analysis, which is a fascinating area of compiler design. Usually you'll use something like a use-def chain to figure out all the points where a variable might've been defined. If all of those are outside of the loop, then the computation itself may be moved outside of the loop too, and the loop just refers to the temporary that's the result. I'd encourage you to pick up a book on compiler design if you're interested; here's the Wikipedia article:

http://en.wikipedia.org/wiki/Use-define_chain

This explicitly doesn't apply to "smart" containers like Vectors, which automatically resize themselves. If you were to inline these methods, you'd see that the internal buffer pointer may change inside the loop (assuming that you call functions that mutate the object; if you don't, and the language has a notion of const-correctness, a sufficiently smart compiler could probably optimize them too), and so the compiler can't move any code outside. Same with strlen; its return value depends upon the contents of the memory buffer pointed to, so unless the compiler can prove that that memory buffer never changes (hard; any function may reach in through a pointer and set a character to null), it can't optimize the call away.


Brilliant, thank you for the explanation and the Wikipedia link. I'm going to stop by the library this weekend to see if they have any books on compiler design, do you have any recommendations there?

Thanks again, its great to learn something new.


The classic is the "Dragon Book":

http://www.amazon.com/Compilers-Principles-Techniques-Alfred...

I'm also partial to Appel's "Modern Compiler Implementation in ML", because it goes into more detail about techniques for compiling functional languages:

http://www.amazon.com/Modern-Compiler-Implementation-Andrew-...


Great, thanks. I'll check these out.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: