SitePoint swallowed my comment, so I'll just post it here:
I found some of your claims questionable, so I cherry-picked one to test for myself.
This code converts arguments[i] to a string object using the
String conversion function. This is possibly the slowest way to
perform such a conversion, although it would be the most
obvious to many developers coming from other languages.
Much quicker is to add an empty string ("") to the value you
wish to convert:
I compared the time it took to run 100,000 iterations each of String(fn) and fn + "", and found the results far less clear-cut than your assertion suggests. In fact, fn + "" was 6.6% faster on Firefox 3.5, but 15.8% slower on a recent Chromium build. In any case, even over 100,000 runs the difference was a few milliseconds, so this particular case hardly seems like an area to focus on for optimization.
I wonder how many of your other strong claims would stand up to similar testing. Test code and results: http://gist.github.com/232944
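A test along these lines is enough to reproduce the comparison (a rough sketch; the choice of fn and the run count are arbitrary, and the numbers vary a lot by browser):

// Rough sketch of the String(x) vs. x + "" comparison; numbers vary a lot by browser.
var fn = function () { return 42; }; // arbitrary value to stringify
var RUNS = 100000;

function timeIt(label, body) {
  var start = new Date().getTime();
  for (var i = 0; i < RUNS; i++) {
    body();
  }
  console.log(label + ": " + (new Date().getTime() - start) + " ms");
}

timeIt("String(fn)", function () { return String(fn); });
timeIt("fn + \"\"", function () { return fn + ""; });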
I've done a lot of testing on creating strings by concatenation and it turns out that + is actually one of the slowest. It's much faster, for example, to create an array and then join it with "". Maybe try that in your test and see how it compares.
Example:
var x = [];
x.push("latortuga");
x.push(" is so right");
var str = x.join("");
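A quick way to put numbers on that claim (a rough sketch; the relative ordering varies a lot between engines and with the size of the strings involved):

// Sketch: building one big string with += vs. pushing parts and joining once.
var PARTS = 10000;

var start = new Date().getTime();
var s = "";
for (var i = 0; i < PARTS; i++) {
  s += "chunk" + i;
}
console.log("+= concatenation: " + (new Date().getTime() - start) + " ms");

start = new Date().getTime();
var parts = [];
for (var j = 0; j < PARTS; j++) {
  parts.push("chunk" + j);
}
var joined = parts.join("");
console.log("push + join: " + (new Date().getTime() - start) + " ms");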
Some Sitepoint Article: How not to Critique a Javascript Library
for (var i = fromIndex; i < arr.length; i++) {
This for loop looks up the .length property of the array (arr)
each time through the loop. Simply by setting a variable to
store this number at the start of the loop, you can make the
loop run much faster:
From six lines above, in array.js:
if (arr.indexOf) {
  return arr.indexOf(obj, opt_fromIndex);
}
if (Array.indexOf) {
  return Array.indexOf(arr, obj, opt_fromIndex);
}
It only falls through to there if Array has no indexOf prototype method on it at all. In that case, the performance of the 'for' loop is the least of your concerns, because you're using a terribly slow browser where it won't matter anyway. They used a cached length property later on in the very same file, in places where it might actually matter.
Google’s developers seem to have figured this trick out
later on in the same file. From array.js, line 153:
var l = arr.length; // must be fixed during loop... see docs
...
for (var i = l - 1; i >= 0; --i) {
This loop is better in that it avoids a property lookup each
time through the loop, but this particular for loop is so
simple that it could be further simplified into a while loop,
which will run much faster again:
var i = arr.length;
...
while (i--) {
On V8 and SquirrelFish Extreme, these two loops are exactly the same speed (and indeed probably generate near-identical native code). On Firefox and older Chrome/WebKit, the average difference on an array of 10,000 elements is something like the difference between 0.068 ms and 0.072 ms to iterate (I don't remember the exact figure from my machine; it's negligible). In V8, Google's own native iterators are implemented in JavaScript using that incrementing 'for' loop.
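For reference, the comparison boils down to something like this (a sketch with empty loop bodies; any real work inside the loop dwarfs the difference):

// Sketch: cached-length incrementing for loop vs. decrementing while loop.
var arr = [];
for (var n = 0; n < 10000; n++) {
  arr[n] = n;
}

var start = new Date().getTime();
for (var run = 0; run < 1000; run++) {
  var l = arr.length;
  for (var i = 0; i < l; i++) {
    // empty body: measuring loop overhead only
  }
}
console.log("for, cached length: " + (new Date().getTime() - start) + " ms");

start = new Date().getTime();
for (var run2 = 0; run2 < 1000; run2++) {
  var j = arr.length;
  while (j--) {
    // empty body
  }
}
console.log("while (j--): " + (new Date().getTime() - start) + " ms");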
But not all of Closure Library’s performance woes are due
to poorly optimized loops. From dom.js, line 797:
switch (node.tagName) {
  case goog.dom.TagName.APPLET:
  case goog.dom.TagName.AREA:
  case goog.dom.TagName.BR:
  case goog.dom.TagName.COL:
  case goog.dom.TagName.FRAME:
  case goog.dom.TagName.HR:
  case goog.dom.TagName.IMG:
  case goog.dom.TagName.INPUT:
  case goog.dom.TagName.IFRAME:
  case goog.dom.TagName.ISINDEX:
  case goog.dom.TagName.LINK:
  case goog.dom.TagName.NOFRAMES:
  case goog.dom.TagName.NOSCRIPT:
  case goog.dom.TagName.META:
  case goog.dom.TagName.OBJECT:
  case goog.dom.TagName.PARAM:
  case goog.dom.TagName.SCRIPT:
  case goog.dom.TagName.STYLE:
    return false;
}
return true;
This kind of code is actually pretty common in Java, and
will perform just fine there. In JavaScript, however,
this switch statement will perform like a dog each and
every time a developer checks if a particular HTML element
is allowed to have children.
Experienced JavaScript developers know that it’s much
quicker to create an object to encapsulate this logic:
var takesChildren = {}
takesChildren[goog.dom.TagName.APPLET] = 1;
takesChildren[goog.dom.TagName.AREA] = 1;
...
This code can be further bulletproofed against outside
interference using hasOwnProperty (see below for a full
explanation of this).
return !takesChildren.hasOwnProperty(node.tagName);
Really? You tested it? Because a 'switch' does not necessarily generate the same code as a dynamically constructed set of object properties. In Google's original example, if none of the cases match, it immediately returns 'true'. In SitePoint's example, if the property does not match on the highest level, it must crawl the property chain on the object until it eventually reaches 'undefined'. Not to mention that you are making two inner function calls and a fully dynamic property lookup (!(), hasOwnProperty, and node.tagName), none of which I believe can be calculated statically.
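If anyone wants to actually measure it rather than guess, a self-contained comparison of the two shapes looks roughly like this (a sketch with an abbreviated tag list, outside of Closure; results will differ by engine):

// Sketch: switch-based check vs. object + hasOwnProperty, mirroring both versions.
function canHaveChildrenSwitch(tagName) {
  switch (tagName) {
    case "APPLET": case "AREA": case "BR":
    case "IMG": case "INPUT": case "SCRIPT": // abbreviated list
      return false;
  }
  return true;
}

var takesChildren = {}; // same (oddly named) lookup table as the article's version
takesChildren["APPLET"] = 1;
takesChildren["AREA"] = 1;
takesChildren["BR"] = 1;
takesChildren["IMG"] = 1;
takesChildren["INPUT"] = 1;
takesChildren["SCRIPT"] = 1;

function canHaveChildrenObject(tagName) {
  return !takesChildren.hasOwnProperty(tagName);
}

var tags = ["DIV", "IMG", "SPAN", "SCRIPT", "P", "BR"];
var start = new Date().getTime();
for (var i = 0; i < 1000000; i++) {
  canHaveChildrenSwitch(tags[i % tags.length]);
}
console.log("switch: " + (new Date().getTime() - start) + " ms");

start = new Date().getTime();
for (var j = 0; j < 1000000; j++) {
  canHaveChildrenObject(tags[j % tags.length]);
}
console.log("hasOwnProperty: " + (new Date().getTime() - start) + " ms");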
Actually, I'm going to end this rant early, because I just read more of the article, and it looks like the author just keeps pulling random shit that he remembers reading from "JavaScript: The Good Parts" or whatever and vomiting it back up onto his keyboard.
Agreed, very poorly written article. You can't approach optimisation in isolation like that. There's a whole range of factors, like readability, situational awareness, concurrency, etc., that need to be considered.
Plus, this is the first release of what has previously been an internal-only library. It needs a little time to mature. jQuery wasn't perfect out of the gate either.
it must crawl the property chain on the object until it eventually reaches 'undefined'
If it were a normal property lookup, yes, but not with 'hasOwnProperty' - that's the same as any old hash table lookup. So efficiency here really depends on the implementation, and Google's is probably faster. Regardless, we're really splitting hairs - these optimizations will save a few milliseconds, on average. As much as I am a perfectionist, it's all completely negligible in terms of UX.
I'm talking about the class tracing V8 does, not prototype chains. "Property chain" meaning in V8, it must climb the series of what it calls "hidden classes" in order to find all of the properties. Other engines do various things when you pass a totally dynamic property lookup in. Some slower engines probably use naïve hash tables, I don't know.
Feel free to correct me if I'm wrong on this one, though. I've never really ripped open V8 or any of the other engines to look inside, I just go by what I read from their design docs.
From my reading of the V8 docs, the hidden class climbing is done when properties are added to the class. Each added property causes a transition to the next hidden class. The final hidden class has all the properties as offsets, and a failed lookup might chain through prototypes (not sure there) but does not walk back through the other hidden classes.
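As an illustration of that model (a sketch only; hidden classes are engine-internal and not observable from script):

// Each property assignment moves the object along a chain of hidden classes (in V8's model).
function Point(x, y) {
  this.x = x; // transition: empty hidden class -> class with x
  this.y = y; // transition: class with x -> class with x and y
}

// a and b go through the same transitions, so they end up sharing a final
// hidden class in which x and y live at fixed offsets, so lookups are fast.
var a = new Point(1, 2);
var b = new Point(3, 4);

// c takes a different path by growing an extra property, so it ends up
// with a different hidden class than a and b.
var c = new Point(5, 6);
c.z = 7;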
It also seems like the author forgot that code brevity is very important when maintaining such a large code base. Everyone can understand that switch, but the suggested object replacement is not so obvious.
Might some of these little idiomatic optimizations be automatically performed by the Closure Compiler, making their absence from the original source irrelevant?
Perhaps, but it does seem that the Google Closure developers are unaware of these optimizations. Unless the compiler was written by an entirely different group, I would be surprised.
Closure was written by many, many different people. I can attest that the core developers (the ones whose names are on the project) are very aware of these optimizations - one of them interviewed me when I was being hired, and another was in charge of my JS readability review. But Closure's a big chunk of code, the core developers tend to be very busy people, and not everyone can review everything.
Well, I learned something new from most of the points mentioned.
Only thing I would take exception to is:
>Those 1.0s are telling. Languages like Java represent integers (1) differently from floating point numbers (1.0). In JavaScript, however, numbers are numbers. (1 - factor) would have worked just as well.
It works just as well, but the code seems clearer (without any loss of functionality or performance) when you explicitly make it a '1.0'.
Yes this point was particularly ridiculous. Yes, this is a convention from Java (and C++), but I actually do this in my JS code as well. Every minifier will easily fix this for you, and in my opinion it makes things clearer. In fact, there exists an argument that this form of clarity is more important in a language like JavaScript precisely because you don't have a type system (in other words, seeing a 1.0 when an operation should clearly be dealing with ints can flag incorrect logic).
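For what it's worth, the two literals are indistinguishable at runtime; the choice is purely about readability:

// JavaScript has a single number type, so 1 and 1.0 are the same value.
console.log(1 === 1.0);                   // true
console.log(typeof 1 === typeof 1.0);     // true, both are "number"
console.log((1 - 0.25) === (1.0 - 0.25)); // true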
I agree, but his example is strong. While calculating something every time the mouse moves would be expensive and memoization is probably a good idea, the mouse pointer in most situations has some locality, so some LRU cache is probably a better approach. There are probably examples where you'll get more bang for your buck using a different cache storage as well.
Does that mean their memoization function is flawed in general because it doesn't support that? No. If you need such specialization for optimization, you're free to roll your own, which is why I agreed with you.
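Rolling your own for a case like that might look something like this (a sketch of a bounded, LRU-style memoizer for single-argument functions; this is not Closure's goog.memoize, just an illustration of the idea):

// Sketch: memoizer with a bounded, least-recently-used cache.
function memoizeLru(fn, maxEntries) {
  var cache = {};
  var order = []; // keys ordered from least to most recently used
  return function (arg) {
    var key = String(arg);
    if (cache.hasOwnProperty(key)) {
      // Move the key to the most-recently-used end.
      order.splice(order.indexOf(key), 1);
      order.push(key);
      return cache[key];
    }
    var value = fn(arg);
    cache[key] = value;
    order.push(key);
    if (order.length > maxEntries) {
      delete cache[order.shift()]; // evict the least recently used entry
    }
    return value;
  };
}

// Usage: keep at most 100 recent results of an expensive calculation.
var expensive = function (x) { return Math.sqrt(x) * Math.log(x + 1); };
var cached = memoizeLru(expensive, 100);
cached(42); // computed
cached(42); // served from the cache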
I don't know what definition of memory leak the author is working with, but a real memory leak occurs when all references to a dynamically allocated chunk of memory are lost without the memory being free()d. An unbounded cache does not fit this definition.
The term "memory leak" is also sometimes applied, particularly in GC environments, to mean "memory that is still referenced, but will never be used again".
Poorly-thought-out caching is an easy way to do that, especially if the cached values themselves have a chain of references to other values.
With caching I don’t think you can assume any values are sure to never be used again.
Yes, there are some situations where circular references to DOM elements and the like will cause reference count based GC engines to leak memory. But an unbounded cache does not fit that description. It could be poor memory usage but not a memory leak.
The section "The Slow Loop" posits that many for-loops in the library are much slower than they could be because they "[look] up the .length property of the array ... each time through the loop".
I wonder what the performance difference is in terms of referencing a local var vs. referencing an object property? Surely it's negligible?
I wondered that too. I wonder if they just don't notice it because the modern VMs they mainly use would cache or inline it, but older implementations do a search for the property each time? (I'm speculating on both those points)
I think it's an oversight. The coding conventions I work with ban that form of a for-loop, precisely because the .length access can potentially be very costly. (I'm in a different part of the company, one that doesn't use Closure, but our latency constraints are even tighter than theirs are.) If you're iterating over a NodeList and not just an array, it can actually turn an O(N) loop into an O(N^2) one.
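A concrete example of the NodeList case (a sketch; getElementsByTagName returns a live collection, and in some implementations reading its .length is much more expensive than reading an array's):

// Sketch: iterating a live NodeList with and without a cached length.
var divs = document.getElementsByTagName("div"); // live NodeList
var name;

// .length is re-read from the live collection on every pass; in older
// implementations this can make the loop effectively O(N^2).
for (var i = 0; i < divs.length; i++) {
  name = divs[i].tagName;
}

// One length lookup up front, then a plain integer comparison each pass.
for (var j = 0, len = divs.length; j < len; j++) {
  name = divs[j].tagName;
}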
Checking the length of an array on each iteration of a loop is an amateur mistake, in any language. Some of these points were pretty nitpicky, but I thought this was a valid complaint.
In compiled languages it's usually not worth breaking convention to factor it out, because the compiler will apply loop-invariant code motion and pull it out of the loop body anyway. The exception is if you're calling a function that the compiler can't prove to be referentially transparent, e.g. strlen.
Really interesting, thanks. I don't spend a lot of time with compiled languages these days.
Question: how can the compiler be sure an array's length wouldn't have changed? I suppose it could examine the code within the loop, but then the same could be said for strlen. What's the difference?
Most languages don't have functionality to resize an array in place. If you realloc in C, you get back a pointer that may or may not be the same as what you passed in. If you 'new' in Java, you must assign that pointer to your initial variable. If the loop body never assigns to the array, the compiler can be sure that it still points to the same block of memory, and its length will be the same.
This is a specific example of dataflow analysis, which is a fascinating area of compiler design. Usually you'll use something like a use-def chain to figure out all the points where a variable might've been defined. If all of those are outside of the loop, then the computation itself may be moved outside of the loop too, and the loop just refers to the temporary that's the result. I'd encourage you to pick up a book on compiler design if you're interested; here's the Wikipedia article:
This explicitly doesn't apply to "smart" containers like Vectors, which automatically resize themselves. If you were to inline these methods, you'd see that the internal buffer pointer may change inside the loop (assuming that you call functions that mutate the object; if you don't, and the language has a notion of const-correctness, a sufficiently smart compiler could probably optimize them too), and so the compiler can't move any code outside. Same with strlen; its return value depends upon the contents of the memory buffer pointed to, so unless the compiler can prove that that memory buffer never changes (hard; any function may reach in through a pointer and set a character to null), it can't optimize the call away.
Brilliant, thank you for the explanation and the Wikipedia link. I'm going to stop by the library this weekend to see if they have any books on compiler design, do you have any recommendations there?
I'm also partial to Appel's "Modern Compiler Implementation in ML", because it goes into more detail about techniques for compiling functional languages:
With 10M entries in the array, the "slow" way takes 509 ms and the "faster" way takes 362 ms. So we can save roughly 15 nanoseconds per iteration. Or a 30% speedup - assuming nothing happens inside the loop.
(Of course the numbers may vary wildly per browser.)
Somehow I doubt this is the performance problem behind slow web pages.
In a case like this, it doesn't really matter how much of a performance benefit you get, because it's ridiculously trivial to implement. This in particular is one of those optimizations that I remember being told about back in CS101.
A commenter at sitepoint noted the positive point about Google open sourcing Closure. I think it's a good point. It leaves room for people to review, critique, and help improve things.
I thought the more interesting thing about Closure was the optimizer (and I wonder if the optimizer offsets any of the issues described in this critique.)
Regarding the library, I think the interface is more important at this point than the implementation. The latter can be refactored, while the former will always be there.
For reference for anyone else who doesn't know how this should be done correctly (I had to go and look at the dojo source).
dojo.isString = function(/*anything*/ it){
  // summary:
  //    Return true if it is a String
  return (typeof it == "string" || it instanceof String); // Boolean
}
I write a lot of JavaScript and I'm not even sure this is preferable. The "object wrappers" like String are considered deprecated in some circles as a matter of style, e.g. by Crockford. From this point of view, a stricter check is cleaner. And contrary to what the article implies, String(foo) is a cast to a primitive string, and doesn't create an object like 'new String' does.
So the Google function correctly returns whether String(foo) is a no-op.
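The distinction is easy to see in a console; nothing here is specific to Closure or Dojo:

// String() produces a primitive; new String() produces a wrapper object.
typeof String(42);                    // "string"
String(42) instanceof String;         // false
typeof new String(42);                // "object"
new String(42) instanceof String;     // true

// Which is why a typeof-only check treats the two differently:
typeof "abc" == "string";             // true
typeof new String("abc") == "string"; // false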
This function checks if a particular variable has a value defined. Or it does, unless a 3rd party script sets the global undefined variable to something else. This single line of code anywhere in the page will bring Closure Library crashing down:
var undefined = 5;
Relying on the global undefined variable is another rookie mistake for JavaScript library authors.
Granted, yes, I think libraries in particular should code defensively, and I originally saw the var undefined; trick in jQuery some time ago, but realistically I think they are making a mountain out of a molehill. In my mind it's more of a rookie mistake to trample the undefined value in your own code than it is to write code that assumes it hasn't been trampled. I'm no elite JS interpreter wizard, but I believe the other reason libraries use the var undefined; trick is to gain a slight speed boost.
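For reference, the defensive pattern in question looks something like this (a sketch; jQuery wraps its whole source in a closure and declares a local undefined in much the same way):

// Sketch: a local undefined is guaranteed to really be undefined, even if some
// other script on the page has done "var undefined = 5;" at global scope.
(function () {
  var undefined; // declared but never assigned

  function isDef(val) {
    return val !== undefined;
  }

  // An alternative that needs no local variable at all:
  function isDefTypeof(val) {
    return typeof val != "undefined";
  }
})();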
Some code is so bad that it's hard to have sympathy for the people who wrote it when stuff breaks. Overriding undefined is one of those things.
If we're getting that nitpicky, having an isDef method at all is adding an unnecessary function call that will slow down your program. And while you certainly could create an undefined variable for checking against each time, I'd hardly say not doing so is a "rookie mistake".
The blogger complains that there is a permanent lack of some optimizations in the code because the authors have chosen an intuitive way of expressing things in the language. Maybe the problem is not the authors, but the fact that the efficient way of programming in JS is usually counter-intuitive.
Wow, sometimes reading around HN is a little scary! The sheer amount of expertise that the community can bring to bear on nearly any topic is pretty crazy. Even more so with something like this that impinges on a core subject for many of the people here...
I'd say it isn't. As you said, most of the problems are minor. Some of the performance assumptions are false. Very little of this seems to be of any consequence.
Agreed. I was expecting a critique of the library API. JavaScript APIs (including, especially, the browser APIs) have a long history of being overly verbose and designed as if they were Java.
Has anyone used the closure library API enough to comment?