V8 performance can beat C sometimes (debian.org)
46 points by pmelendez on Oct 11, 2013 | 35 comments


This isn't V8 beating C; this is one regex library being faster than another (both written in C or C++).


No, V8's regex library is not written in C. V8 compiles its regular expressions to machine code at runtime. The availability of the compiler at runtime can be a liability for things like startup time or memory use, but can also be a key advantage in situations like regular expression parsing or packet processing.


You need a regex library to turn a regex into machine code. What the parent means is that the V8 regex library compiles them, while the one used in the C code merely interprets the regex.

(The C and C++ thing is just adding confusion here)


Well it's not written in JavaScript, so as the OP said it's either C or C++.


The code that's doing the actual matching is not written in C or C++. It's an internal regex bytecode compiled directly to machine code.


Regular expressions are part of the JavaScript language, so they're as much "written in JavaScript" as any other language construct.

The point is that a regular expression compiled by V8 (which is not written in JavaScript) is faster than a regular expression as compiled by some particular C compiler. Of course a C compiler could possibly be written to understand calls to a standard regexp API and compile those to binary.


That's true.

But it does show how good V8 is at optimizing out the JS part of JS when it's irrelevant to the computation, and it shows how well-optimized the regexps in V8 are.

None of this is terribly surprising given V8's use case, but only a few years back this would have been a ridiculous result. I enjoyed taking a minute to reflect on that.


This is actually a JIT compiled regex beating an interpreted one. And it's beating it by a far smaller margin than I would have expected, honestly.


As far as I know, Irregexp[1] compiles on V8 byte code, so you can't use it in isolation without V8.

Please, let me know if you find the opposite.

[1] http://blog.chromium.org/2009/02/irregexp-google-chromes-new...


> Irregexp[1] compiles on V8 byte code

I was thinking: but V8 doesn't have byte code.

Then opened the link:

"After optimization we generate native machine code which uses backtracking to try different alternatives."


It's not JavaScript bytecode, but regex bytecode. V8 parses regular expressions to regex ASTs, takes them through some more intermediate forms, then finally to a special regex bytecode. That bytecode is either interpreted or JIT compiled (depending on some build options, I think). You can look at a list of bytecodes in [1], and at the x64 JIT in [2].

1 - http://code.google.com/p/v8/source/browse/trunk/src/bytecode...

2 - http://code.google.com/p/v8/source/browse/trunk/src/x64/rege...
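
To make the "regex to bytecode, then interpret" idea concrete, here is a minimal sketch in Python. It is not V8's code: the opcode names (`char`, `any`, `split`, `jmp`, `match`), the function names, and the tiny supported syntax (literals, `.`, and postfix `*`, `+`, `?` on single characters) are all assumptions made up for illustration. The interpreter backtracks, which mirrors the approach only in spirit.

```python
# Toy opcodes; these names are hypothetical, not V8's actual bytecode set.
CHAR, ANY, SPLIT, JMP, MATCH = "char", "any", "split", "jmp", "match"

def compile_pattern(pattern):
    """Compile a tiny regex subset into a flat list of instructions."""
    prog = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        atom = (ANY,) if ch == "." else (CHAR, ch)
        has_quant = i + 1 < len(pattern) and pattern[i + 1] in "*+?"
        quant = pattern[i + 1] if has_quant else ""
        start = len(prog)
        if quant == "*":
            # start: split start+1, start+3 ; atom ; jmp start
            prog += [(SPLIT, start + 1, start + 3), atom, (JMP, start)]
        elif quant == "+":
            # start: atom ; split start, start+2
            prog += [atom, (SPLIT, start, start + 2)]
        elif quant == "?":
            # start: split start+1, start+2 ; atom
            prog += [(SPLIT, start + 1, start + 2), atom]
        else:
            prog.append(atom)
        i += 2 if quant else 1
    prog.append((MATCH,))
    return prog

def run(prog, text, pc=0, sp=0):
    """Backtracking interpreter; True only if the whole text matches."""
    while True:
        op = prog[pc]
        if op[0] == MATCH:
            return sp == len(text)
        if op[0] == CHAR:
            if sp >= len(text) or text[sp] != op[1]:
                return False
            pc, sp = pc + 1, sp + 1
        elif op[0] == ANY:
            if sp >= len(text):
                return False
            pc, sp = pc + 1, sp + 1
        elif op[0] == JMP:
            pc = op[1]
        elif op[0] == SPLIT:
            # Try the first branch; on failure, fall back to the second.
            if run(prog, text, op[1], sp):
                return True
            pc = op[2]

program = compile_pattern("ab*c")
print(program)
print(run(program, "abbbc"), run(program, "abx"))
```

A JIT would take the same instruction list and emit machine code for it instead of looping over it in `run`.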


Well, I might have got it wrong, but I got the bytecode reference from here:

http://v8.googlecode.com/svn/trunk/src/interpreter-irregexp....

Skimming the code it does look like they are interpreting byte code from Irregexp but I would need to read further to say it for sure.


Reading through the responses to this comment, the upshot w.r.t. regex performance is that native regex support in a language is a significant advantage. Because C doesn't have it, the regex ultimately has to be translated into C at parse time, whereas with JavaScript the regex is converted to machine code at run time.

This is similar to the advantage of abstraction layers in general, although there's a cost if you create abstraction layers for things no-one much cares about.


The code for the C implementation is incredibly verbose compared to the code for the JS implementation.

JS: http://benchmarksgame.alioth.debian.org/u64/program.php?test...

C: http://benchmarksgame.alioth.debian.org/u64/program.php?test...


That's just ... absurd, in comparison at least. It uses TCL and glib?!

I'll go out on a limb here and claim that this C program is not the shortest possible equivalent to the JS version.



No, this is not V8 performance beating C.

This is V8 calling a regexp library written in C++. C just happens to be using a different library that is slower.


>"This is V8 calling a regexp library written in C++"

But it also beat the C++ implementation though.


I guess that re2 (the library that this C++ code is using) is slower than irregexp (the library that V8 uses) on this benchmark.


Rather than argue back and forth about which regex implementation is better here, I'd simply note the absence of multicore numbers for V8 and that the crappy Tcl library beats single-threaded Javascript when run multicore.

It would be interesting, however, to compare the performance here to that of the "grep" command on the same expressions. I'd be surprised at anything short of a total smackdown of Javascript V8.

I would expect an even bigger smackdown if one invoked GPUs:

http://bkase.github.io/CUDA-grep/finalreport.html

Even more interesting would be an OpenCL implementation that did exactly what V8 is doing to the regexes (Why? Because it can).

And total aside, this is why I don't participate in TopCoder competitions - they never let you have any real fun.


The idea was obviously to pick a test and environment that would make it look like V8 can beat something other than interpreted JS. Otherwise it would've been the complete package: string manipulation, numerical computation, multi-core etc.


If I read this benchmark right: v8 regexps are faster than TCL regexps.

Hardly surprising.


It also beats all the other regexp engines, though.


Likely because V8 only works on platforms where JIT is possible, so the regex engine relies on JITting specialized native code for the regex. A portable C/C++ regex library can't really do that.
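
A rough analogy for "JITting specialized code for the regex", sketched in Python rather than machine code. Everything here is an assumption for illustration: the function names are made up, the "pattern" is just a fixed-length string where `.` matches any character, and Python's `exec` stands in for a native code generator. The generic matcher re-reads the pattern on every call; the specialized one has the pattern baked into generated source.

```python
def interpret(pattern, text):
    """Generic matcher: walks the pattern again on every call."""
    if len(text) != len(pattern):
        return False
    for p, t in zip(pattern, text):
        if p != "." and p != t:
            return False
    return True

def specialize(pattern):
    """Generate and compile a matcher hard-wired to one pattern."""
    checks = [f"text[{i}] == {p!r}" for i, p in enumerate(pattern) if p != "."]
    body = " and ".join([f"len(text) == {len(pattern)}"] + checks)
    source = f"def matcher(text):\n    return {body}\n"
    namespace = {}
    # Runtime code generation: the step a portable, ahead-of-time
    # compiled library cannot easily perform.
    exec(compile(source, "<specialized>", "exec"), namespace)
    return namespace["matcher"]

matcher = specialize("a.c")
print(matcher("abc"), interpret("a.c", "abc"))
```

The generated function contains no loop over the pattern and no wildcard test, which is the kind of work a regex JIT removes.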


The V8 version used nearly 50% more memory than the C version. It is well known that you can often get faster execution times at the cost of increased memory usage.


More like 30%


Anyone else read this as "VB" as in, "Visual Basic"? Now that I would bookmark :)


Yes, I got the same impression.


Why is golang so much slower than Java? Is it because the language is immature or because the language libraries are immature?


Go not only uses a different library for processing regular expressions, but a different algorithm altogether. It is slower in some cases than the standard PCRE implementations but faster in others.

http://code.google.com/p/re2/
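
For anyone wondering what the "different algorithm" looks like: re2-style engines simulate an automaton instead of backtracking, keeping a set of live states and advancing all of them one input character at a time. A minimal sketch, assuming a hypothetical toy instruction encoding (`char`, `split`, `jmp`, `match`) and a hand-assembled program for `a*b`; none of this is taken from re2 or Go.

```python
def add_thread(prog, pc, threads, seen):
    """Follow jmp/split so `threads` holds only char/match instructions."""
    if pc in seen:
        return
    seen.add(pc)
    op = prog[pc]
    if op[0] == "jmp":
        add_thread(prog, op[1], threads, seen)
    elif op[0] == "split":
        add_thread(prog, op[1], threads, seen)
        add_thread(prog, op[2], threads, seen)
    else:
        threads.append(pc)

def lockstep_match(prog, text):
    """Advance every live thread per input character (whole-text match)."""
    threads = []
    add_thread(prog, 0, threads, set())
    for ch in text:
        next_threads, seen = [], set()
        for pc in threads:
            op = prog[pc]
            if op[0] == "char" and op[1] == ch:
                add_thread(prog, pc + 1, next_threads, seen)
        threads = next_threads
    return any(prog[pc][0] == "match" for pc in threads)

# Hand-assembled program for the pattern a*b (toy encoding).
A_STAR_B = [
    ("split", 1, 3),
    ("char", "a"),
    ("jmp", 0),
    ("char", "b"),
    ("match",),
]

print(lockstep_match(A_STAR_B, "aaab"), lockstep_match(A_STAR_B, "aaba"))
```

Each character is processed at most once per instruction, so the running time is bounded by program length times input length. A backtracking engine has no such bound, but it can be faster on inputs where little backtracking actually occurs.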



Quickly, everyone convert their C to javascript!


So Ruby, Java, Lua, and Python are all about the same?


Congratulations! :-)

(And the program is IMHO clearer than the C implementation)


To the V8 team and the implementer of the testing program.

Disclaimer: I had nothing to do with this; I just thought it was interesting enough to post.



