From digging, it looks like the issue here (or one of the issues) is that the Web Assembly encoding for some Lisp use cases (multiple value return) is not very compact. Making the proposed change would presumably reduce the size of Lisp code once compiled to Web Assembly; it would not really affect the behavior of the program once compiled to native code.
I can relate to the Web Assembly team's reluctance to add features which are only really wanted by a small subset of users, especially when they only affect the binary size. These features, if implemented, may suffer from poor test coverage. My own preference is that compact binaries are nice, but if you're going to use a high-level language, some increase in binary size (say, an order of magnitude) is just expected unless the encoding/VM and language were designed in concert (Java + JVM, or C# + CIL are two examples). Heck, C++ binaries can be enormous.
Then again, I didn't dig deep enough to really understand the nuances of the argument. Perhaps someone could elaborate.
The target prevails over the language. The implicit intent is to convert LLVM bitcode.
"Please keep in mind that Wasm is not a user-facing language, it is a compilation target. To justify the extra complexity of this feature for its own sake, you would need to come up with convincing evidence that compilers would significantly benefit from it. I doubt that."
Why are you talking about SBCL here? This is not for the benefit of SBCL, which does not target Web Assembly, but about implementing multiple return values easily. Yes, not many languages have multiple return values today, but Web Assembly should not focus only on the languages that are popular today but also on the languages of the future.
From what I understand, it really is just Lisp, not any language with multiple return values. The issue here is Lisp semantics for multiple return values being slightly cumbersome (but by no means difficult).
This is not only about code size, but about efficiency and, more fundamentally, about what multiple values are. Are MRV just syntactic sugar around lists, a second-class mechanism ("tuples don't nest") for when you have zero-or-one return value, or a first-class feature which can benefit from the support of the runtime compiler?
The last time you discussed this[0], you attributed to Common Lisp some problems that were actually present in Scheme's multiple values.
And here, as explained in this thread, Wasm is not a user-facing language but a compiler target, which means that the usual complaints about the somewhat verbose binding constructs of CL's values (which I find acceptable as a user) are not a problem either, since the code ought to be generated from other languages (which might not even define multiple values themselves, but for which a compiler could generate code that uses MVR).
It seems that, for you, the only acceptable way to produce multiple values is to return a composite value. I think that having a dedicated type to represent multiple values is a useful tool to give to the programmer.
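To make the "composite value" option concrete, here is a minimal C sketch of roughly what a compiler without first-class multiple values typically has to emit: the two results get packed into a small struct (or spilled through memory / an out-parameter). The function and field names are purely illustrative.

    #include <stdio.h>

    /* Illustrative only: without first-class multiple values, "return
     * two things" gets lowered into returning a small struct. */
    struct divmod_result {
        int quotient;
        int remainder;
    };

    static struct divmod_result divmod(int a, int b) {
        struct divmod_result r = { a / b, a % b };
        return r;
    }

    int main(void) {
        struct divmod_result r = divmod(17, 5);
        printf("q=%d r=%d\n", r.quotient, r.remainder);
        return 0;
    }

The efficiency argument for first-class MVR in the target is that this round trip through a composite (or through linear memory) can be skipped when the target itself knows a function returns several values.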
You can think that. I disagree. That's fine. Good on you.
And yes, I did mis-attribute some problems with Scheme's implementation of MVR to Common Lisp. However, I admitted to the mistake in that thread, and eventually got the idea.
So please, don't confuse the issue. Here, I'm not talking just about Lisp. I just don't like MVR in general. And yes, I think that a composite value is the only good way to produce multiple values. But I don't have an objection to WASM adding MVR, because it's compiler-facing, as you pointed out. I object to higher-level languages adding it.
Not really. Aside from the smaller binary size, parsing a binary bytecode vs. parsing a JS pseudo-bytecode makes a huge difference in load time. Plus they get to optimize the format for efficient translation, instead of hacking JS constructs to get the desired semantics and then trying to recognize those patterns in the JIT.
Asm.js is a hack made with duct tape and chewing gum; wasm is a solution designed/engineered to solve the general problem, not just a smaller asm.js.
ASM.js builds can be quite large; tens of MB is not uncommon. Reducing the binary size isn't just "nice to have", it changes the viability of the platform.
I don't think anyone is arguing that binary size isn't relevant. It just has to be weighed against the other parameters we want to optimize, like implementation complexity.
Is there any way to reconcile a power-of-two memory layout with bounds checks? I can't imagine all code being constrained to power-of-two memory, but if you throw multi-threading into the mix somehow, I think it would start making more sense to have the best of both worlds.
My knowledge of LLVM and SBCL is limited, but I know Emscripten a bit and how it works. I will look into "multivalues" and "power of two memory access" in LLVM.
Actually, getting the support of other language communities would be even more useful. If you can find use cases for the feature in other languages, it's going to be much more compelling.
The memory allocation feature feels more like it should be part of whatever memory allocation library is used than something baked into the language. For example, jemalloc supports the kind of alignment discussed here, does it at runtime, and doesn't require specific behavior from the lower levels. Any language is going to need a runtime because you can't put in every feature that every user will need, and a malloc doesn't seem like a huge issue, especially with one already written that an implementation can crib from.
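For reference, the kind of power-of-two alignment being discussed can already be requested from a standard allocator at runtime; here is a minimal C sketch using C11's aligned_alloc (jemalloc and posix_memalign offer the same capability), with the sizes chosen arbitrarily for illustration:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        /* Request a 64-byte-aligned, 1 KiB block. The alignment must be a
         * power of two, and (for aligned_alloc) the size a multiple of it. */
        void *p = aligned_alloc(64, 1024);
        if (p == NULL)
            return 1;
        printf("aligned block at %p\n", p);
        free(p);
        return 0;
    }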
The memory allocation feature being asked for is not a language feature. It has to be baked into WebAssembly if you want to use portable bit masking on pointers.
It's a very low-level implementation detail that enables fast execution of dynamically typed languages. Boxing has a high cost and consumes memory; type tags embedded in pointers can be very fast. For that to work, you need objects aligned on power-of-two byte boundaries.
Adding this feature enables an efficient execution strategy for scripting and dynamic languages.
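To illustrate the technique being referred to (a hypothetical tagging scheme, not anything WebAssembly defines): if every heap object is aligned on an 8-byte boundary, the low 3 bits of its address are always zero, so they can carry a type tag that is masked off before the pointer is dereferenced. A rough C sketch, with the tag values made up for the example:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical 3-bit tag scheme enabled by 8-byte alignment. */
    #define TAG_MASK   ((uintptr_t)0x7)
    #define TAG_FLONUM ((uintptr_t)0x1)

    static uintptr_t tag_ptr(void *p, uintptr_t tag) {
        return (uintptr_t)p | tag;        /* low bits are free: p is 8-byte aligned */
    }

    static void *untag_ptr(uintptr_t v) {
        return (void *)(v & ~TAG_MASK);   /* strip the tag before dereferencing */
    }

    int main(void) {
        double *obj = aligned_alloc(8, sizeof(double));
        if (obj == NULL)
            return 1;
        *obj = 42.0;

        uintptr_t tagged = tag_ptr(obj, TAG_FLONUM);
        printf("tag=%lu value=%g\n",
               (unsigned long)(tagged & TAG_MASK),
               *(double *)untag_ptr(tagged));

        free(obj);
        return 0;
    }

The request in the thread is essentially that the allocator/VM guarantee that alignment, so this trick works portably instead of falling back to boxing everything.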
The primary target of WebAssembly is strongly-typed pre-compiled languages, where the kinds of features you want would just lead to slowdowns and excessive memory consumption. There is no hardware currently out there that is a tagged architecture, so expecting them to bend backwards is not a realistic option.
> The mission of this group is to promote early-stage cross-browser collaboration on a new, portable, size- and load-time-efficient format suitable for compilation to the web.
This is collaborative work where people can make suggestions (I cannot judge whether the proposal was fairly evaluated or not).
> WebAssembly team doesn't want to listen my ideas on how WebAssembly should work.
Your title implies that the WebAssembly team has the best knowledge and/or expertise to develop WebAssembly. They are probably experts in their own domain but are still willing to take advice from other contributors.
"This masking strategy would in turn require a power-of-two related memory size, and there has been a lot of resistance to this too."
And try to think about the implications it has on the memory model of the VM that is going to execute/JIT WebAssembly.
A power-of-two memory model isn't really viable at this level, I think. And you don't need to think about it much to figure out that jumping to 256 MB of memory just because your app/page needs 130 MB is a bit of a counter-optimization :)
The resistance is the sensible thing to do in this case :)
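For context, here is a rough sketch (in C, with the memory size and function names invented for illustration) of the two bounds-checking strategies being weighed: with a power-of-two linear memory, every access can be kept in range with a single AND, while an arbitrary memory size needs an explicit compare (or hardware guard pages) per access.

    #include <stddef.h>
    #include <stdint.h>

    /* Strategy 1: power-of-two memory, mask the address.
     * One AND per access, but the heap must be rounded up to 2^n
     * (e.g. a 130 MB app ends up reserving 256 MB). */
    #define MEM_SIZE_POW2 ((uint32_t)1 << 28)   /* 256 MiB */

    static inline uint32_t wrap_addr(uint32_t addr) {
        return addr & (MEM_SIZE_POW2 - 1);
    }

    /* Strategy 2: arbitrary memory size, explicit bounds check.
     * A compare and branch (or trap) per access, but no rounding up. */
    static inline int in_bounds(uint32_t addr, uint32_t access_size, size_t mem_size) {
        return (size_t)addr + access_size <= mem_size;
    }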
BTW, language implementations which rely on LLVM for code generation would get it for free. Well, or at least with much less pain.
BTW2, time to appreciate how LLVM's approach is superior to the JVM madness/religion (and how Golang's is even more clever: do less by doing it right, a minimalistic, essentialist asceticism).
A modern CPU+OS is a good-enough hardware VM and target platform. Process isolation under an OS is the right level of abstraction.
A VM running as a user-level OS process, which tries to do the OS's job and reimplement everything inside itself, is simply ridiculous. JavaScript follows the same madness.
Multi-threading for imperative code is a big mistake: it breaks isolation and results in lock hell, a context-switching nightmare, and layers of unnecessary complexity that are impossible to reason about.
At the cost of wasting almost as many resources as it serves.
The top is about popularity, not quality. Junk food is also popular.
My analysis was about first principles, not abstract ones but ones grounded in reality. Those who get the principles right win in the long run.
Erlang (where the VM is not a byte-code interpreter), Golang, Haskell (except when monads are abused by idiots), etc. are designs based on the right principles. Java was a primitive religion based on superstitions (the fear of pointers) from the start.
Performance on a simplified task is the least important metric.
BTW, it would be wonderful to see "memory used" and "lines of code used, including all dependencies" columns next to these charts. And "length of stack trace in kilobytes", of course.
Sorry, I didn't read this particular link. I have seen too many of them before. Principles are above particularities.)
Edit: as an illustration, a closer-to-real-world example chart from the same site:
Let me illustrate the thesis about the necessity of proper abstractions and principles grounded in reality in another way.
There are way too many cases of meaningless bloatware in human history, including the writings produced by Hegel, Marx and Engels. Millions of people suffered because these graphomaniacs produced 4000+ pages of so-called [political] philosophy, full of pure abstractions, abstract concepts and metaphysical design patterns. The shit doesn't fly, except for confusing the minds of a bunch of lesser idiots, who ruined whole nations afterwards.
On the other hand, there are the writings of "down to earth" guys, such as Buddha or Christ, or to a lesser extent the guys who wrote the Upanishads (which use rather poetical language), which literally saved, or at least improved, billions of lives. In the realm of philosophy, guys like Thomas Hobbes and Adam Smith wrote far fewer pages and described some aspects of reality much better.
Piling up layer upon layer of crap disconnected from reality, wrong abstractions and dubious abstract principles praised by brainwashed followers especially because they are so bogus and so abstract, is a road to ruin.
I think it is not too hard to notice rather striking similarities.)
No. Not acceptable. You seem to have forgotten what a VM IS. It stands for VIRTUAL MACHINE. The idea is that you can run the VIRTUAL MACHINE on top of ANY PHYSICAL ARCHITECTURE and have the applications work, so long as the machine's basic assumptions are followed (i.e. there's some kind of I/O, and, for most of them, a screen capable of displaying graphics). A well-designed VM can be re-implemented anywhere, and the software running on it will just work. Just look at the Z-machine, or the Squeak VM.
The problem with a modern CPU+OS as a VM is that it cannot be re-implemented on other hardware effectively. You can't pull a piece of software designed to run on the x86+Unix "VM" and write a VM to make it run on ARM+Windows. Not fast, not in a way that you'd want to use. Try writing native-code translation fast enough that you can run Quake 3 without even noticing the difference. That's why VMs exist.
The original rationale for the JVM -- "write once, run anywhere" -- no longer exists since nobody downloads applets anymore.
The more sensible technology for making code run on multiple platforms still works quite well. It's called a compiler.
Not if you make OS-specific syscalls. Your system has to account for the fact that not everybody runs on the same system. VMs do that, and do it far better than most compilers, when it comes to reliable cross-platform support without doing a ton of re-writing.
> Multi-threading for imperative code is a big mistake: it breaks isolation and results in lock hell, a context-switching nightmare, and layers of unnecessary complexity that are impossible to reason about.
Go is multithreaded and imperative. Sure, it has some nice concurrency features to help you untangle it, but all of the dangers of shared-memory multithreading are right there.
…as is Swift with NSOperation/GCD. Apple's process isolation tech (XPC) is built upon their multithreading tech (GCD). The two are complementary, not antagonistic.
Doing multithreading on a runtime with automatic reference-counting GC and expecting performance is nuts. Every reference assignment is a potential point of write contention.
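To spell out the claim (a schematic C11 sketch, not any particular runtime's implementation): with atomic reference counting, every assignment of a shared reference costs at least one atomic read-modify-write, and when many threads touch the same object those operations all contend on the same cache line.

    #include <stdatomic.h>
    #include <stdlib.h>

    /* Schematic: what a "simple" reference assignment costs under
     * atomic reference counting. */
    typedef struct object {
        atomic_long refcount;
        /* ... payload ... */
    } object;

    void assign_ref(object **slot, object *newval) {
        object *old = *slot;
        if (newval)
            atomic_fetch_add(&newval->refcount, 1);   /* contended RMW on hot objects */
        *slot = newval;
        if (old && atomic_fetch_sub(&old->refcount, 1) == 1)
            free(old);                                /* last reference dropped */
    }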
It is possible to do it, even sensible when you need it for GUIs for example. Just don't expect it to use the hardware in a sensible manner.