Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Higher-order abstract syntax (wikipedia.org)
76 points by adamnemecek on Aug 16, 2016 | hide | past | favorite | 19 comments


Based on my limited knowledge, SSA form is at a higher level than AST (right?). If yes, would SSA be considered an example of HOAS?

Regardless, besides the question posed by OP, I wonder what the relation between SSA and HOAS might be?


ASTs and SSA are orthogonal concepts. ASTs are a representation of syntax (including SSA) where immediate subterms are easily accessible (unlike with strings as representation). Easy access to subterms is important for the efficiency of type-checkers and code-generators. In contrast, SSA is a simple programming language that is often used as an intermediate form in code generators.

HOAS is a form of embedding a language L with binders into a language L-Base also with binders and functions, but such that L-binders are handled by functions in L-Base and the problem of defining capture-avoiding substitution for L-terms is reduced to reusing L-Base's capture-avoiding substitution. (NB L = L-Base is possible but not required.)

HOAS is neat, but (simplifying a bit) prevents you from making inductive definitions on L terms. There are various ways of getting around this shortcoming, the most principled one is probably the nominal approach of A. Pitts et al. Another problem with HOAS is that the representation may contain 'junk'. That means there are L-Base terms that do not represent L-programs.


HOAS representations can only contain junk if you have not taken appropriate precautions. See the paper "Boxes Go Bananas" for one technique.


Yes, that's right. But you need parametric polymorphism for "Boxes Go Bananas" which may not be available in some languages.


I'd be very interested to hear opinions of people who've actually used this and if so for what?


It's quite ingenious but it's not as convenient as locally nameless representation for working underneath binders. In Haskell if you want to do variable binding you ought to use the bound library[1]. It's been seriously used in at least one compiler project[2].

[1] https://hackage.haskell.org/package/bound [2] https://github.com/ermine-language/ermine


The Rust compiler uses DeBrujin indices to refer to variables at some point in the compilation process.


But that's pretty much exactly what HOAS is avoiding ;)


HOAS is particularly nice as a surface representation for deeply embedded DSLs, especially when combined with GADTs or tagless-final style. But you'll likely want to transform to e.g. DeBruijn indices for optimisations, low-level codegen etc


It is nice for languages where it is a first-class concept (e.g. Twelf, which is a logic/theorem language), but it gets complicated very quickly if you want to do AST transformations.

I cannot recommend HOAS outside of very narrow areas.


The [Beluga language](http://complogic.cs.mcgill.ca/beluga/) is a variant of Twelf which aims to make such transformations much easier.

Full disclosure: Brigitte Pientka was my supervisor. I did not work on the compiler, but I did make the language's logo.


I've found HOAS to be a convenient notation for higher order functions in DSLs in Coq [1] and Python [2], but both use cases were to reason about lambda-calculus.

[1] https://github.com/fritzo/hstar/blob/6df3347/src/DeBruijn.v#... [2] https://github.com/fritzo/pomagma/blob/ada575c/src/reducer/s...


It's been used quite a bit: http://okmij.org/ftp/tagless-final/JFP.pdf


If anyone want another concrete example of a tagless interpreter in OCaml, I wrote one on SO awhile back:

https://stackoverflow.com/questions/22676975/simple-lambda-c...

Candidly, I find them a pain to write, but it is faster and integrates with the host language's type checker.


I've used it a few times. It's a cute way to make shallow DSLs which need to identify binders. It gets very annoying to do any work under the binders however making it scale poorly to many tasks important in more complex DSL implementations.


I think you mean deeply embedded DSLs (as per my earlier comment)? Shallow DSLs implement semantics directly, having only one interpretation and don't have abstract syntax (the AS in HOAS).


I suppose I don't really define deep and shallow in any precise way. I'd say what you describe is a "shallower" DSL than any one using HOAS, but using HOAS will be shallower than one that has an explicit binding structure.

Maybe that's pretty non-standard. I like thinking of shallow-deep as a continuum, though.


These terms do have fairly precise meanings in the FP community; there are many papers on the subject that define shallow versus deep as I described. For example: http://www.cs.ox.ac.uk/publications/publication7584-abstract...


Yes, I'm aware of those papers. I've read many including the one that you linked. As far as I'm aware, while you can use the strict definition (is there, in the metalanguage, the creation of an AST?) they're often discussed as more of a continuum.

HOAS is a great example. There is the creation of an AST, but perhaps the most important and tricky part of any AST, the binding system, is left to the metalanguage. For that reason exactly I'm happy to say that HOAS is "shallower" than, say, a de Bruijn indexed binding system.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: