
Frances Allen wrote "A Catalogue of Optimizing Transformations" in 1971. 50 years(!!!) later, they are still the backbone of optimizing compilers. I think the only major thing missing is autovectorization.

She was in her 30s when she wrote it.



I think you should read it if you work on compilers, even now. But the reason is a bit different.

It was a survey of the state of the art at the time, but obviously it is not the state of the art now. Then why should you read it? Because it is written in two layers.

The first layer goes: we tried many optimization ideas, but only these were effective in practice: inlining, register allocation, etc. Others were not. Surprisingly, this layer is still mostly true today! This is either happy or sad, depending on your view. Personally I think it testifies that compilers are a mature field, and that the field had matured by 1970. (And that Frances Allen did the lion's share of the work maturing it.)

The second layer is: so here is how you should do inlining, register allocation, etc. While this layer is also full of gems, it is necessarily badly outdated. The paper predates graph coloring register allocation, for example. On the other hand, ironically, the state of the art 1970s algorithms are often a good choice today when you want an algorithm that is fast and low memory. (Ironic, because they were slow and high memory at the time!) This doesn't apply when there is an important new concern, for example cache locality, but happily such concerns mostly don't affect compilers.

I think there should be a project to write the-state-of-the-art-in-1970s compiler. It would be a great simple and minimal alternative to GCC and LLVM, and it would also work great as a JIT code generator. We probably should name it Fran.


> On the other hand, ironically, the state of the art 1970s algorithms are often a good choice today when you want an algorithm that is fast and low memory.

Simple things can run at higher speeds than complex things. Simplicity should be an architectural element: not an incidental quality, but a feature.

Your idea for the Fran compiler is excellent. Throw in META II from Dewey Schorre and it can be language-agnostic.


Maybe I should call the backend for Objective-S Fran :-)

> graph coloring register allocation

I knew some people working on linear optimization, at the time one of the most performance-intensive applications around (with real money/competitiveness in many industries riding on it). The compiler that produced the best code, by quite a bit, was the IBM FORTRAN compiler (3090 was the preferred target at the time), which also didn't do graph coloring. It just allocated the registers by loops from innermost to outermost.

¯\_(ツ)_/¯
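
A rough sketch of that innermost-loop-first idea in Python, as I understand it (not IBM's actual algorithm; the data model and names are made up for illustration):

    # Toy register allocator that prefers variables used in the deepest loops.
    # uses_by_var maps each variable name to the loop-nesting depths of its uses.
    def allocate(variables, uses_by_var, num_regs):
        # Priority = deepest loop a variable is used in (innermost wins).
        def priority(v):
            return max(uses_by_var.get(v, [0]))

        ranked = sorted(variables, key=priority, reverse=True)
        assignment, spilled = {}, []
        free = ["r%d" % i for i in range(num_regs)]
        for v in ranked:
            if free:
                assignment[v] = free.pop(0)
            else:
                spilled.append(v)   # everything else lives in memory
        return assignment, spilled

    # Example: 'i' and 'x' are hot in the inner loop, 'msg' only at depth 0.
    regs, spills = allocate(["i", "x", "msg"], {"i": [2], "x": [2, 1], "msg": [0]}, 2)

Real allocators also track live ranges and splitting, of course; the point is just how far "whatever lives in the innermost loop wins a register" can get you.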

For a simple compiler, we should also look at Wirth's works, and Wirth's rule: an optimization has to pay for itself, that is, the compiler compiling itself with the optimization both in the executable and in the source code has to be faster than without it in both places.
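
A minimal sketch of how you would measure "pays for itself" (Python; the compiler binaries and source file names here are purely hypothetical):

    # Time the self-compile in both configurations; the optimization stays only
    # if the optimized compiler compiling its own (optimized) source is faster.
    import subprocess, time

    def self_compile_time(compiler_binary, compiler_source):
        start = time.perf_counter()
        subprocess.run([compiler_binary, compiler_source], check=True)
        return time.perf_counter() - start

    t_with = self_compile_time("./cc_with_opt", "cc_with_opt.src")        # hypothetical paths
    t_without = self_compile_time("./cc_without_opt", "cc_without_opt.src")
    print("optimization pays for itself" if t_with < t_without else "drop it")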


> For a simple compiler, we should also look at Wirth's works, and Wirth's rule: an optimization has to pay for itself, that is, the compiler compiling itself with the optimization both in the executable and in the source code has to be faster than without it in both places.

That will result in a compiler best at optimizing programs that behave like compilers. Which may be fine for simple compilers, but it's worth keeping in mind that it will not be a good compiler for domains where the programs don't behave like compilers (such as scientific computing, or even ML these days) but still want high performance.


So that is where this Golang rule comes from.


I agree with most of your comment, but I don't agree that compilers were a mature field by 1970. That was before SSA, graph coloring register allocation, and even linear scan, all three of which were pretty revolutionary.

I saw her 2007 Turing Award speech. I have tremendous respect for Fran and like your idea to name a codegen after her :-)



Did the concept of SIMD even exist in 1971?


Vector processors date from the 1960s, but the necessary techniques for loop analysis (using the Fourier-Motzkin method) were not found until the 1990s.
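
To give a flavor of what such loop analysis decides, here is the much simpler GCD dependence test (just a sketch in Python, not Fourier-Motzkin, and not what any particular compiler does):

    # GCD dependence test: a write to A[a*i + b] and a read of A[c*j + d]
    # (possibly in different iterations) can touch the same element only if
    # gcd(a, c) divides (d - b).  Loop bounds are ignored here.
    from math import gcd

    def may_depend(a, b, c, d):
        g = gcd(a, c)
        return g == 0 or (d - b) % g == 0

    print(may_depend(2, 0, 2, 1))   # A[2*i] vs A[2*j + 1]: False -> safe to vectorize
    print(may_depend(1, 0, 1, -3))  # A[i]   vs A[j - 3]:   True  -> possible dependence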


I think you are mistaken about this.

Vectorizing compilers have been around since at least the late 1970s; the Cray-1[0] had a vectorizing compiler in 1978.

Lamport’s Parallel Execution of DO Loops[1] was published in 1974.

[0] https://en.m.wikipedia.org/wiki/Cray-1#Software

[1] http://lamport.azurewebsites.net/pubs/do-loops.pdf


Flynn's taxonomy was codified in 1966


I can’t think of any noteworthy idea in computing that didn’t exist in 1971.


Static Single Assignment (SSA) form was only created in 1988, which is apparently so late that compilers are still discovering it (and the optimizations it enables) as some novelty today.

(Never even mind the Static Single Information form.)

I wonder if part of the reason SSA is not implemented from the start by many compilers is precisely because it came too late to be included in seminal works like A Catalogue of Optimizing Transformations; so people who rely on those works as a canon of "classes of optimization techniques that work" won't even be aware of it.

-----

More snarky: the earliest dataflow analysis paper I can find is from 1972. :)
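
For anyone who hasn't run into it: SSA just means every variable is assigned exactly once, so every use has exactly one reaching definition. A toy straight-line renamer in Python, purely for illustration (real SSA construction also needs phi nodes at control-flow joins, which is what the Cytron et al. algorithm computes efficiently):

    def to_ssa(stmts):
        # stmts: list of (dest, [operand names]); undefined operands get version 0.
        version, out = {}, []
        for dest, operands in stmts:
            ops = ["%s%d" % (v, version.get(v, 0)) for v in operands]
            version[dest] = version.get(dest, 0) + 1
            out.append(("%s%d" % (dest, version[dest]), ops))
        return out

    # x = a + b; x = x + c; y = x   becomes   x1 = a0+b0; x2 = x1+c0; y1 = x2
    print(to_ssa([("x", ["a", "b"]), ("x", ["x", "c"]), ("y", ["x"])]))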


I think the reason SSA is not implemented from the start by many compilers is that many compilers are 30+ years old.

More seriously, my recollection (it having been about 20 years since I last took a compilers course) is that it took a while for even researchers to be convinced of the advantages of SSA, as the transformation causes a quadratic blowup in code size in the worst case (though it turns out to be less in real-world cases).

Also, Sussman & Steele proposed CPS in 1975, which is closely related to SSA.


A 30-year-old compiler would still have come out in 1990, well after SSA-based optimizations were discovered. But maybe I'm expecting too much in thinking that the author of a flagship optimizing compiler should be up-to-date on CS research in compiler optimization, before deciding the architecture of their compiler.

(Also, plenty of the compilers and/or JITs that I'm talking about are far newer. The first attempt to get a JVM to use SSA optimization during JIT — within SafeTSA — only occurred in 2000. Such an approach was copied by pretty much every JVM implementation by 2005, suggesting that a legacy of incompatible architecture was never the problem, but rather that JVM implementors just didn't think it was a worthwhile technique until they saw it demonstrated for their particular workloads.)

CPS is closely theoretically related to SSA (it carries equivalent information across lexical boundaries) but CPS isn't a basis for optimization transforms in the same way that SSA is. You can't hoist loop invariants using CPS... as far as I know, at least.
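
For concreteness, "hoisting a loop invariant" is this kind of rewrite (sketched in Python):

    # Loop-invariant code motion, by hand: x*y doesn't change inside the loop,
    # so the optimizer can compute it once before the loop.
    def before(a, x, y):
        for i in range(len(a)):
            a[i] = x * y + i      # x*y recomputed every iteration

    def after(a, x, y):
        t = x * y                 # hoisted: computed once
        for i in range(len(a)):
            a[i] = t + i

With SSA the invariance check falls out almost for free: an expression is invariant if the unique definitions of its operands sit outside the loop.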


> But maybe I'm expecting too much in thinking that the author of a flagship optimizing compiler should be up-to-date on CS research in compiler optimization, before deciding the architecture of their compiler.

In 1990 I don't think SSA was broadly considered to be the basis of a good optimizing compiler the way it is today. So the "author of a flagship optimizing compiler" would be gambling on an unknown that looked good in a couple new papers. The Cytron paper on how to efficiently compute it came out in 1991. So SSA was still a WIP.

It wasn't really until the late 1990s, IMO, that SSA became widespread, and into the 2000s that it became the standard.


Someone else already replied to your first paragraph, so I'll reply to your second:

The SafeTSA papers from 2000-2001 were the papers that showed that SSA was practical in a JIT environment. The fact that this approach was adopted less than 5 years after those papers came out suggests that compiler authors are in fact reading research papers.

Rearchitecting a production JIT to use a different IR for theoretical gains, when nobody has demonstrated those gains on a toy JIT, is high risk, and demonstrating those gains on a toy JIT is research, not development.


I think a lot of now-big, "professional" projects started as people's non-serious hobby projects.

If you just want to write a compiler for fun, I can understand not reading up on state-of-the-art theory beforehand. And then it's accidentally useful to other people, and by then it's too far into development to change the fundamental architecture.


VLIWs[0] and trace scheduling[1] were both from the early 1980s.

Dependence analysis, automatic vectorization, and automatic parallelization were invented after 1971.

Fast algorithms for computing dominators[2] were late-70s.

Graph coloring register allocators were introduced in the early 1980s.

[0] https://en.m.wikipedia.org/wiki/Very_long_instruction_word

[1] https://en.m.wikipedia.org/wiki/Trace_scheduling

[2] https://en.m.wikipedia.org/wiki/Dominator_(graph_theory)


Agree. I worked on MIMD systems in the 1980s that seemed cutting edge at the time. Then in 2001 I toured the Concrete, ND phased array radar facility where I saw a 16-cpu machine built by Western Electric in 1969 that had essentially the same architecture.


I definitely do not have the chops for this, but I would love to see some sort of visualization that traces major, seemingly-modern computing ideas to their roots.



