Write any javascript code with just these characters: ()[]{}+

dherman · on Aug 10, 2012

Arg, scooped! I was working on this exact same thing! :D

Since you've beaten me to it, let me offer up a couple of additional tricks you might want to use. If you want to make this completely independent of browser APIs, you can eliminate the dependence on window.location (or atob/btoa as the sla.ckers.org poster did).

Trick #1 is to get the letter "S".

You can extract this from the source code of the String constructor, but you want to be careful to make this as portable as possible. The ES spec doesn't mandate much about the results of Function.prototype.toString, although it "suggests" that it should be in the form of a FunctionDeclaration. In practice you can count on it starting with [whitespace] "function" [whitespace] [function name]. So how to eliminate the whitespace?

For this, we can make use of JS's broken isNaN global function, which coerces its argument to a number before doing its test. It just so happens that whitespace coerces to NaN, whereas alphabetical characters coerce to 0. So isNaN is just the predicate we need to strip out the whitespace characters. So we can reliably get the string "S" from:

[].slice.call(String+"").filter(isNaN)[8]

Of course, to get isNaN you need the Function("return isNaN")() trick, and you know how the rest of the encoding works.
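A minimal sketch of trick #1, spelled out step by step in plain JavaScript. It assumes an engine where `String + ""` begins with `function String(`, which is common but not mandated by the spec; the variable names are only illustrative.

```javascript
// Reach isNaN without writing it as an identifier in the encoded program:
// the Function constructor evaluates its body in the global scope.
const isNaNFn = Function("return isNaN")();

// Source text of the String constructor,
// e.g. "function String() { [native code] }".
const source = String + "";

// Split into single characters and keep those that coerce to NaN.
// Whitespace coerces to 0, so isNaN(" ") is false and whitespace is dropped.
const chars = [].slice.call(source).filter(isNaNFn);

// The 8 letters of "function" occupy indices 0..7, so index 8 is "S".
const capitalS = chars[8];
console.log(capitalS);
```

Because the filter removes every whitespace character, the index stays the same regardless of how many spaces or newlines the engine puts around `function`.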

Trick #2 then lets you get any lowercase letter, in particular "p".

For this, we can make use of the fact that toString on a number allows you to pick a radix other than 2, 8, 10, or 16. Again, the ES spec doesn't mandate this, but in practice it's widely implemented, and the spec does say that if you implement it its behavior needs to be the proper generalization of the other radices. So we can get things like:

(25).toString(26) // "p"

(17).toString(18) // "h"

(22).toString(23) // "m"

and other hard-to-achieve letters.

But once you've got "p", you're home free with escape and unescape, as you said in your post.
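As a rough sketch of how trick #2 feeds into unescape: the snippet assumes the other letters of "unescape" are already available from the usual strings ("undefined", "false", "[object Object]"), and it writes them as a plain literal for readability. How the "%" character is obtained is left out here.

```javascript
// Number.prototype.toString(radix) uses the digits 0-9 followed by a-z,
// so the largest digit in base N+1 is the letter whose value is N.
const p = (25).toString(26); // "p"
const h = (17).toString(18); // "h"
const m = (22).toString(23); // "m"

// With "p" available, the name "unescape" can be spelled and looked up
// through the Function constructor.
const name = "unesca" + p + "e";
const unescapeFn = Function("return " + name)();

// unescape turns a %XX sequence into the character with that code.
const upperA = unescapeFn("%41"); // 0x41 is the code for "A"
console.log(p, h, m, upperA);
```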

Dave

alcuadrado · on Aug 10, 2012

Great idea Dave, I considered using String+"" but didn't know how standard it was, so I discarded it.

The slice & isNaN trick is brilliant!

dherman · on Aug 10, 2012

PS I flipped the logic in my explanation; whitespace coerces to 0 and letters coerce to NaN. Which is why filter removes the whitespace and not the letters.

mathias · on Aug 10, 2012

Now that you mention it — V8 has an interesting bug with excessive Number#toString() decimal digits (http://code.google.com/p/v8/issues/detail?id=1627).

For example:

    (1.1536999999997645e-10).toString(33).match(/[a-z]+/g)[81]; // 'oops'

More here: https://gist.github.com/1153826

dherman · on Aug 10, 2012

Funky! Luckily that bug doesn't interfere with this trick, since it's only relying on pretty-printing integers.

Dave

sequoia · on Aug 22, 2012

Thanks for this great comment & thanks OP!

Is there some reason not to use 36 as a radix and access the whole lowercase alphabet like

    (10).toString(36) // "a"
    ...
    (35).toString(36) // "z"

? I'm curious why you use varied combinations of radixes & base numbers.

EDIT: Friend pointed out that you are only extending the number set out to what's required for that one character. Makes sense now. :)
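A small sketch of both approaches, for comparison. The helper name `letterViaSmallestRadix` is made up for this example; the likely motivation for the smaller radix is that smaller numbers are cheaper to spell in the restricted character set, but that is an assumption.

```javascript
// Digits with values 10..35 map to "a".."z" in base 36.
const alphabet = [];
for (let value = 10; value <= 35; value++) {
  alphabet.push(value.toString(36));
}
console.log(alphabet.join("")); // "abcdefghijklmnopqrstuvwxyz"

// 36 itself is not a single digit in base 36.
console.log((36).toString(36)); // "10"

// The smallest radix that yields a given letter is its digit value plus
// one, which matches (25).toString(26) giving "p".
function letterViaSmallestRadix(letter) {
  const value = parseInt(letter, 36); // digit value of the letter, 10..35
  return value.toString(value + 1);
}
console.log(letterViaSmallestRadix("p")); // "p"
```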

mistercow · on Aug 10, 2012

I think even though that's not guaranteed by the standard, it's a lot more portable in principle than relying on the DOM.

CurtHagenlocher · on Aug 10, 2012

This is like a bizarro-world lambda calculus, complete with its own Church numerals.

dag11 · on Aug 10, 2012

I made a little script to extract the original javascript from a script obfuscated with OP's tool (http://patriciopalladino.com/files/hieroglyphy/).

And because I felt it was appropriate, I created this extraction script in an obfuscated form!

Use this to extract obfuscated scripts: http://pastebin.com/raw.php?i=Q9TB4wEF

Just save your obfuscated script in a variable called "original" and then run my code. It'll return with the extracted script.

Oh, and it won't work on itself. That's because I didn't use the obfuscation tool to create it. I made it mostly by hand: http://pastebin.com/9LBWCSJs

quarterto · on Aug 10, 2012

There are no words to describe how dirty this makes me feel.

apendleton · on Aug 10, 2012

This post title omits "!" which is also necessary.

alcuadrado · on Aug 10, 2012

My fault, not intended

stcredzero · on Aug 11, 2012

So, basically Javascript is just a superset of an esolang that contains itself.

http://esolangs.org/wiki/Main_Page

(Especially true if you're developing with a Javascript interpreter hosted in Javascript. Really, it's esolangs all the way down.)

maartenscholl · on Aug 10, 2012

If you like reducing programs to basic expressions you should read up on SKI combinator calculus and the X combinator. Here is a paper that describes the construction of an efficient X combinator[1]. Reading the paper gave me insight into how simple yet powerful combinatory logic is.

[1] www.staff.science.uu.nl/~fokke101/article/combinat/combinat.ps

bgeron · on Aug 10, 2012

I evalled all pieces of JavaScript of <30 characters in Rhino, which takes 1 minute on my laptop. That gives 4219 possible values, after stripping out some really uninteresting stuff. Doesn't seem to contain anything interesting, unfortunately.
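For readers who want to try a search of this kind, here is a tiny sketch. It is not the Rhino script used for the numbers quoted here: the character set, the length limit, and the way results are deduplicated are all assumptions chosen to keep the example fast.

```javascript
const ALPHABET = "()[]{}+!";
const MAX_LENGTH = 4; // kept tiny here; the search space grows as 8^n

// Distinct results, keyed by type and string form, each with one example
// source that produces it.
const results = new Map();

function tryCandidate(src) {
  try {
    const value = (0, eval)(src); // indirect eval: runs in the global scope
    const key = typeof value + ":" + String(value);
    if (!results.has(key)) results.set(key, src);
  } catch (err) {
    // Most candidates are syntax errors or throw at run time; skip them.
  }
}

// Depth-first enumeration of every string up to MAX_LENGTH characters.
function enumerate(prefix) {
  if (prefix.length > 0) tryCandidate(prefix);
  if (prefix.length === MAX_LENGTH) return;
  for (const ch of ALPHABET) enumerate(prefix + ch);
}

enumerate("");
for (const [key, src] of results) {
  console.log(JSON.stringify(key), "<-", src);
}
```

A practical search at larger lengths would need pruning (for example, skipping prefixes with unbalanced closing brackets), which this sketch leaves out.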

http://pastebin.com/CM5ac6Xi

_ugfj · on Aug 10, 2012

I am not sure about those results. I entered (+[][{}]+{})[+[]] into Chrome console and got N (from NaN[object Object]) while your code lists it as u. If you replace the first +[] with a 0 you get a u (from undefined...). Interesting.
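The three variants can be checked side by side. This is only a sketch of how the expressions evaluate; the variable names are illustrative.

```javascript
// [][{}] looks up the property "[object Object]" on an empty array, which
// is undefined. Unary + binds tighter than binary +, so +undefined is
// evaluated first and gives NaN.
const first = (+[][{}] + {})[+[]];  // "NaN[object Object]"[0]

// Without the leading unary plus, undefined is concatenated directly.
const second = ([][{}] + {})[+[]];  // "undefined[object Object]"[0]

// Replacing the first "+[]" with 0 gives 0[{}], which is also undefined.
const third = (0[{}] + {})[+[]];

console.log(first, second, third); // N u u
```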

jerfelix · on Aug 11, 2012

Looks cool, but I couldn't make it work.

I went to http://patriciopalladino.com/files/hieroglyphy/ and put in a script "alert(1);". This provided me with a script of about 8300 characters.

I created a web page to execute the script:

    <body onload="
    [][(![]+[])[!+[] ...
    </body>

Firebug reports:

    ReferenceError:  Unescaee is not defined.

Looks like it's having trouble picking up a "p".

CUViper · on Aug 11, 2012

Did you try to do this locally? The article explains that the "p" is picked up from window.location, assuming it's http or https. If you're using "file://...", that third character index is 'e' instead.
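A quick illustration of the difference, using plain strings in place of window.location so it runs anywhere. The helper name and the example URLs are made up, and the use of index 3 is inferred from the description here rather than taken from the encoder's source.

```javascript
// Hypothetical helper: the character at index 3 of a stringified URL.
function charAtIndex3(url) {
  return (url + "")[3];
}

console.log(charAtIndex3("http://example.com/page.html"));  // "p"
console.log(charAtIndex3("https://example.com/page.html")); // "p"
console.log(charAtIndex3("file:///home/user/page.html"));   // "e"

// With a file:// URL the spelled-out name ends in "ee" instead of "pe".
const spelled = "unesca" + charAtIndex3("file:///home/user/page.html") + "e";
console.log(spelled); // "unescaee"
```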

jerfelix · on Aug 11, 2012

Good catch! You diagnosed my eroblem eerfectly.

sophiebits · on Aug 10, 2012

The article lists [][+[]] for undefined; you can get away with just [][[]].
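A short check of why both forms work; the variable names are illustrative.

```javascript
// [][+[]]: +[] is 0, so this reads index 0 of an empty array.
const viaZero = [][+[]];

// [][[]]: the inner [] is converted to the property key "", which an
// empty array does not have either.
const viaEmptyKey = [][[]];

console.log(viaZero === undefined, viaEmptyKey === undefined); // true true

// Concatenating with [] turns it into the string needed for letters.
const asString = [][[]] + [];
console.log(asString); // "undefined"
```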

infinity · on Aug 11, 2012

Some of you may also enjoy aaencode by Yosuke Hasegawa:

http://utf-8.jp/public/aaencode.html

Encode any JavaScript program to Japanese style emoticons (^_^)

And of course jjencode:

http://utf-8.jp/public/jjencode.html

(hint: have a look at "palindrome")

skrebbel · on Aug 11, 2012

Apparently, he also did the OP's trick: http://utf-8.jp/public/jsfuck.html (but without {}, even)

dherman · on Aug 11, 2012

Unfortunately, his trick no longer works in current JS engines; it relies on using

    [].sort.call()

which I believe used to return the global object but now throws an exception.

AFAICT, you need to add {} to make this work in current JS engines.
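The behaviour is easy to check. This sketch assumes an engine that follows ES5 or later; the second half shows one route to the global object that still works, going through the Function constructor.

```javascript
let outcome;
try {
  // No receiver is passed, so `this` is undefined inside sort.
  outcome = [].sort.call();
} catch (err) {
  outcome = err;
}
console.log(outcome instanceof TypeError); // true in ES5+ engines

// A non-strict function created by the Function constructor still gets
// the global object as `this` when called without a receiver.
const globalObject = Function("return this")();
console.log(typeof globalObject.isNaN); // "function"
```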

Dave

mistercow · on Aug 10, 2012

Man, if you didn't care about performance or bandwidth, this would be a hell of an obfuscation technique.

alcuadrado · on Aug 10, 2012

This is pretty easy to reverse. Most JS parsers can print the source code of functions, so you can do that for the generated lambdas.
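A minimal sketch of that idea. The function here is a stand-in built from a readable string; in an encoded program the string would be assembled from the restricted characters instead.

```javascript
// Stand-in for the last step of an encoded program: a function built from
// a source string at run time.
const generated = Function("return 6 * 7");

// Function.prototype.toString gives back source text that includes the
// body that was passed in.
const recovered = generated.toString();
console.log(recovered);
console.log(recovered.includes("return 6 * 7")); // true
```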

mistercow · on Aug 10, 2012

Yes of course. And even if they couldn't, it would be trivial to fork an existing JS implementation and make eval spit out its input.
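Forking an engine is not even required in the simple case. The sketch below assumes the encoded program reaches the Function constructor through a `.constructor` lookup on some function; `captureFunctionSource` is a hypothetical helper, not part of any tool mentioned here.

```javascript
// Hypothetical helper: record the source strings handed to the Function
// constructor when it is reached via `someFunction.constructor`.
function captureFunctionSource(run) {
  const original = Function.prototype.constructor;
  const captured = [];
  Function.prototype.constructor = function (...args) {
    captured.push(args[args.length - 1]); // last argument is the body
    return original(...args);
  };
  try {
    run();
  } finally {
    Function.prototype.constructor = original; // always restore
  }
  return captured;
}

const seen = captureFunctionSource(() => {
  // Stand-in for decoded output ending in a constructor call.
  [].filter.constructor("return 1 + 1")();
});
console.log(seen); // [ 'return 1 + 1' ]
```

A program that obtains `Function` some other way, or calls `eval` directly, would need a different hook.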

mseebach · on Aug 11, 2012

That's not necessary, I've got right-click disabled on my website.

erichocean · on Aug 13, 2012

Okay, that made me laugh. :)

simcop2387 · on Aug 10, 2012

Performance might not be too bad actually. My understanding is that he's building up a string with the code you run normally and then evaling it, so the performance might not be bad, aside from the start-up cost. Bandwidth... I don't want to speculate on that one :)

mistercow · on Aug 10, 2012

Actually, I did a small test and found that after gzip, the file size only expands by about 10x. Running both input and output through bz2, the obfuscated file only comes out 3x larger. If you were very protective of your code, and you had enough of it to justify loading up a bz2 decoder on the client side, you could actually make that economical bandwidth-wise.

That said, this was a very small test; the original file was a random snippet of JS code less than 500 bytes, and that itself took a considerable amount of time for hieroglyphy to chew on, so I can't really do a proper test of a larger input file.
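A rough way to repeat a measurement like this in Node.js, using the built-in zlib module. The sample string is a synthetic stand-in made of repeated fragments, not real hieroglyphy output, so its compression ratio says little about real files; substitute actual encoder output for a meaningful number.

```javascript
const zlib = require("zlib"); // Node.js built-in

// Stand-in for encoder output: a few fragments in the restricted
// alphabet, repeated many times.
const fragments = ["(![]+[])[+[]]", "(!![]+[])[+!+[]]", "([][[]]+[])[+[]]"];
let sample = "";
for (let i = 0; i < 2000; i++) {
  sample += fragments[i % fragments.length] + "+";
}

const raw = Buffer.byteLength(sample);
const gzipped = zlib.gzipSync(sample).length;
console.log({ raw, gzipped, ratio: (gzipped / raw).toFixed(4) });
```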

dherman · on Aug 10, 2012

I think if you wanted to make this really robust, you could use more techniques to beef it up.

One technique would be to store verbose or commonly-used string constants in accessible locations like Array.prototype.f. Then you could access, say, the string "prototype" by simply writing

    [][(![]+[])[+[]]]

Once you build up a little scratch storage of the most common or hard-to-encode strings, everything starts getting orders of magnitude smaller.

(Technically, this means that you're polluting the space shared with the program being encoded, so for everything to work the program can't make use of it. But that's a pretty simple invariant to ask of the input program: "don't get or set the 'f' property of arrays.")
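A small sketch of the scratch-storage idea, written with a readable assignment for clarity; in a real encoding the assignment itself would also be spelled in the restricted character set.

```javascript
// Store a verbose string once, under a key that is cheap to spell.
Array.prototype.f = "prototype";

// (![]+[]) is "false", and index +[] (that is, 0) of it is "f", so this
// reads Array.prototype.f through an empty array.
const looked = [][(![]+[])[+[]]];
console.log(looked); // "prototype"

// Clean up so the example does not leak into other code.
delete Array.prototype.f;
```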

Another technique would be to break up large statements into smaller substatements, to avoid fixed limits of JS engines on statement size. You can always avoid semicolons, since ASI is guaranteed to work if you start your statement with a !.
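A sketch of the semicolon-free layout, with a contrasting case. The `note` function and the program strings are made up for the example; the point is only how newlines are treated.

```javascript
const calls = [];
const note = (x) => calls.push(x);

// Each statement starts with "!", which cannot continue the expression on
// the previous line, so a semicolon is inserted at every newline.
const program = "!note(1)\n!note(2)\n!note(3)";
Function("note", program)(note);
console.log(calls); // [ 1, 2, 3 ]

// For contrast, a line starting with "[" is glued onto the previous one:
// "var b = a\n[1]" parses as "var b = a[1]".
const glued = Function("var a = [10, 20]\nvar b = a\n[1]\nreturn b")();
console.log(glued); // 20
```

Since encoded statements tend to begin with `[` or `(`, prefixing each with `!` is what keeps them from merging into the previous line.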

Dave

shurane · on Aug 10, 2012

But think of the reduced complexity (in terms of characters) of this language! This makes me think of the small number of primitives needed to make a LISP machine. Or Brainfuck. Or Unlambda.

gmrple · on Aug 10, 2012

Eh, I can restrict myself to ones and zeros for even more reduced language complexity and have amazing performance if I'm clever.

simonster · on Aug 10, 2012

Bandwidth is probably not too bad. It seems like it would be quite amenable to gzip.

mistercow · on Aug 10, 2012

I mentioned in another reply that it actually does gzip fairly well (and bzips even better). But you still end up with an order of magnitude expansion with gzip (as compared to gzipping the original source), and around 3x expansion with bz2 (again compared to bz2 on the original source).

That's from a <0.5KB test input, so the expansion might be mitigated a little more for larger files. I was going to test on a 3KB microlibrary, but gave up after about 10 minutes of waiting for the conversion to finish.

ctdonath · on Aug 10, 2012

Cross this with John Horton Conway's notion of "Surreal Numbers" and you might be onto something.

alter8 · on Aug 10, 2012

This guy did it with 6 characters by removing {}. But it lacks the detailed description available in this post.

EDIT: I didn't check properly. You only use {} for a minor detail.

http://utf-8.jp/public/jsfuck.html

cseax · on Aug 10, 2012

Is it just me, or does recursing his example break chrome?

skrebbel · on Aug 11, 2012

Could someone please enlighten me as to how this helps doing an XSS attack?

TazeTSchnitzel · on Aug 11, 2012

Some sites "filter" user input instead of escaping it.
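To make the distinction concrete, here is a sketch with two hypothetical functions: a word-based blacklist filter of the kind that can be bypassed, and a basic HTML escaper. Neither is taken from a real site or library.

```javascript
// Hypothetical blacklist filter: rejects input containing suspicious words.
function naiveFilter(input) {
  return /script|alert|eval|document/i.test(input) ? "" : input;
}

// Escaping neutralises markup characters regardless of what the text says.
function escapeHtml(input) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return input.replace(/[&<>"']/g, (c) => map[c]);
}

const plain = "alert(1)";
const fragment = "(![]+[])[+!+[]]"; // evaluates to "a", contains no letters

console.log(naiveFilter(plain) === "");          // true: blocked by the word list
console.log(naiveFilter(fragment) === fragment); // true: passes untouched
console.log(escapeHtml("<img src=x>"));          // "&lt;img src=x&gt;"
```

The filter never sees a forbidden word in the encoded form, so it only helps an attacker who already has a place where the input is evaluated as script.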

jared314 · on Aug 11, 2012

I remember something like this a few years ago. They were using it for XSS. http://news.ycombinator.com/item?id=1153383

rubyrescue · on Aug 11, 2012

this is very cool...let me know if you want a job at inaka (we're in BA and have other people in school working for us)

chris_wot · on Aug 11, 2012

I wonder how well gzip would compress this?

michaelmior · on Aug 11, 2012

Example: 47,734 bytes to 813. ~98%

chris_wot · on Aug 11, 2012

Pretty damn well! I wonder what the decompression and parsing time is...

michokest · on Aug 11, 2012

Minor typo:

> "[object Object]" with {}+[]

I believe it should be []+{}
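The difference shows up when the text is evaluated as a whole program, which `eval` does. A short sketch, with illustrative variable names:

```javascript
// At the start of a statement, "{}" is parsed as an empty block, so
// "{}+[]" is a block followed by the expression +[] (which is 0).
const asStatement = eval("{}+[]");

// In expression position the braces are an object literal.
const asExpression = eval("({}+[])");

// Putting the array first avoids the ambiguity altogether.
const arrayFirst = eval("[]+{}");

console.log(asStatement, asExpression, arrayFirst);
```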

bazookaBen · on Aug 11, 2012

i pasted the entire json library into the field and it just hung. Any tips?

bradsmithinc · on Aug 11, 2012

Witchcraft

Fando · on Aug 11, 2012

really cool

mynameishere · on Aug 10, 2012

Write any Windows application with just the following characters: 0 1

ethereal · on Aug 10, 2012

... you'd need some preprocessing first. The ASCII characters `0' and `1' aren't easy to use to write a program, though you could do it with nasm and some '-' and '+'s I suppose.

If someone can prove me wrong, I'd be very happy though. Writing a program using just '0' and '1' (the ASCII characters) would be awesome. (in an established programming language, and no homomorphisms. :) )