Fortran is an excellent choice for numerical computation. Its non-aliasing assumption allows for compiler optimizations that dramatically speed up computations. It is even easy to get started with concurrent code through OpenMP. When I need to run a simulation or a discontinuous optimization algorithm, I turn to Fortran first.
Here are some resources I have collected over the years.
Glad this was posted. First thing I did on loading the comment section was ^F "alias". Fortran is still used for numerics because it makes writing fast numerical code easy. C simply wasn't designed for it, just as Fortran wasn't designed for low level system/OS programming.
In C99, you can use the "restrict" keyword to tell the compiler an array is not aliased, and even without restrict it's often possible to get the same optimisations in C or C++. When I was still doing HPC in academia, all our C++ code was maxing out the memory bandwidth. I think in practice the difference rarely matters these days.
Having seen plenty of HPC code written by science academics, both when I was one and later when I became an HPC guy: as much as I cringe at some of the Fortran, the C/C++ can be far worse... and I wish that I could retroactively restrict some of my colleagues and customers to sticking with Fortran.
Wow, look at the downvotes! I've been saying this to people for 20+ years, in conference talks and face-to-face conversations, and I've never had a negative reaction. Happy to learn if folks think I'm wrong about my personal experiences. If you've had a different experience, happy to learn about that too (although I'd be surprised if you'd downvote me for that.)
Note that I'm not saying that science academic C++ is never great; I'm saying that it's often unfortunate. Is the code you're talking about code that's progressed through several maintainers without becoming sad? That's often where things go wrong, not the initial code, but keeping it going, cleanly, through multiple collaborators / post-docs / grad students.
I agree that meshing is definitely a difficult situation; I've never seen a solution (for unusual geometry) that I liked much. Things that can pretend to be rectangular? No problem. Very adaptive meshes? Argh.
Hard to say, but I'm looking at 39 contributors, 539 stars, 3,005 commits [1]. Nicely documented, pretty extensive list of users, CI on Linux and Windows. But with that said, yes, I've also dealt with absolutely horrendous academic C++.
Are there any users that are science academics? I wasn't talking about computer scientists / computer graphics / visualization researchers at all in this thread; they do great with C++. I clicked on a bunch of links of the users of this library and didn't see any physics, astronomy, chemistry, fluid dynamics, etc.
So it sounds like you're somewhat like the folks I'm talking about, even if the other users of this code are not... but have you had a lot of turnover of your coders, like grad students and postdocs that turn over every 2-4 years? Because that was the full nature of my complaint: science not-computer-scientists over several maintainers doing better in Fortran.
The hand-unrolling is predicated on the knowledge that the arrays do not overlap. It is obvious to the compiler that the locals do not alias with the arrays, or with each other.
The stores to c[] could alias with a[] or b[], but that hardly matters; we assume that all the array loads and stores take place. We have done by hand the unrolling that the compiler's suspicion of overlapping arrays would otherwise hinder, and could do more of it.
When coding in C for tightness, it helps to pretend that locals (at least basic type scalar ones) are registers and references to arrays and structs are memory loads and stores. Then think like an assembly language programmer, somewhat. It's better to use a few more locals than a few more loads and stores. Try to load once, process with locals, store once.
Yes, but restrict is limited in that it can only assert that two pointed-to regions never alias. Finer-grained aliasing information can be expressed using unions and structs, but it's not as flexible as Fortran, and the non-aliasing guarantee is easily broken just by allowing an address to be taken.
Some processors expose performance counters, but you don't need to measure memory bandwidth accurately to know you are maxing it out. Suppose you know that the memory bandwidth is 1 GB/s and your program reads 100 GB of data to produce its result. If the execution takes ~100 s, then you know you hit the maximum memory bandwidth.
As an introduction, you might want to look into John McCalpin's history as a shallow-water oceanographer and his subsequent fixation on the STREAM benchmark. TL;DR: some algorithms aren't friendly to caching.
I have no idea if this applies to smarnach, but it definitely applies to other people.
Nice bit of trivia! I was an admin of a Primos machine briefly while also doing scientific programming.
Primos wasn't much good compared to the commercial Unixes that were appearing. It did have one nice-ish feature late in its history: the CPL scripting language.
Wow, big trip down memory lane here because my first job was on a Prime mini-computer! Wrote a complete system using CPL scripts (with the occasional bit of Fortran thrown in) on Primos.
Shame to hear that Prime / Primos is now defunct, but that job was back in the '80s!
It means that the compiler knows that pointers/arrays don't overlap in memory. Data can therefore be safely kept in registers instead of being continually re-read from memory to guard against possible alterations between computations [1].
As a counterpoint, it is worth noting that essentially no high-performance BLAS or LAPACK implementations (e.g., GotoBLAS/OpenBLAS, ATLAS, BLIS, etc.) are implemented in Fortran and instead tend to use a combination of C (in order to use vector intrinsics) and assembly.
To be fair, there are GCC built-ins that allow you to tell the compiler that arrays are nonaliasing, but you're right in that it's more difficult in general to get a compiler to spit out fast numeric code with C than it is with Fortran.
I feel like it's a "right tool for the job" situation. Fortran was literally built for the purpose of numerical computation. Multi-dimensional arrays, non-aliasing arrays and pointers, multi-dimensional array intrinsics, intrinsic array binary operators... all come with Fortran right out of the box. Of course, C/C++ can do all of these things, but all of the proposed methods, at least that I have seen, feel like they are trying to hammer a screw.
I have one question - if I compile and link a program written in C and Fortran, can the linker inline Fortran functions in C? I know it can do it if I link object files produced from C source, but can it do it if I link object files from separate languages?
Having never done it with Fortran, I don't know. It should be possible, because e.g. on Linux, everything adheres to the ABI. However, the ABI doesn't specify how to interpret pointers AFAIK. Array ordering [0] and indexing are different in Fortran compared to C, so I'm sure there are pitfalls you'd have to account for.
It's not the link editor's function to inline; an optimizing compiler does that. A linker can, in some cases and depending on the platform, perform some binary optimization, as is for example the case with Sun Studio compilers on SPARC (-xlinkopt=2) or with GCC (-flto), but that's not inlining.
To nitpick, neither compilers nor linkers are typically able to do "binary optimization", if by that you mean analyzing the asm and optimizing it. What happens when you compile with -flto or equivalent is that the compiler hides some internal representation in a special section of the object files; then at the link stage (if you also use -flto when linking), the linker invokes the compiler again, which deserializes those trees from the object files and does the inter-procedural optimizations on those.
Fortran is still better at multidimensional arrays. Other than Matlab, few other languages do multidimensional arrays at all. C-derived languages use arrays of arrays, and use templates, macros, or objects for multidimensional arrays. Fortran optimizers have lots of multidimensional array-oriented optimizations which other compilers usually lack.
I tried to talk the Go and Rust people into putting in multidimensional arrays, but the discussion degenerated into bikeshedding, with arguments for fancy but slower representations so you could take arbitrary N-dimensional slices from a multidimensional array.
As a member of the Rust team, I personally would like Rust to be a suitable replacement for Fortran, and I suspect most Rust core contributors would feel the same. Rust already has competitive performance, so it feels like a small step to be able to fit into Fortran's niche. I don't know the particular multidimensional array discussion you experienced, but evolving a language takes time, and meanwhile people experiment with what is possible out-of-tree, as with the ndarray library, which seems to be well respected.
Right now there is a focus on SIMD in Rust, which is crucial for this domain, so there is progress being made.
> I don't know the particular multidimensional array discussion you experienced, but evolving a language takes slow time, and meanwhile people experiment with what is possible out-of-tree, like with the ndarray library, which seems to be well respected.
A Fortran-like multidimensional array library could effectively happen out of tree, but ndarray is closer to Python's numpy.ndarray than to Fortran arrays. And this means that ndarray cannot support negative indexing (see here: https://github.com/bluss/rust-ndarray/issues/152). I started a prototype of a Fortran-like array here: https://github.com/Luthaf/mudi, if anyone wants to look at the code or even work on this.
My only problem at the time was that `Index` must always return a reference, and cannot return a new struct. From my understanding, this could be solved with the HKT/ATC work.
> Right now there is a focus on SIMD in Rust, which is crucial for this domain, so there is progress being made.
This is very exciting! And knowing the Rust community, we may even get a cross-platform cross-architecture SIMD library with a nice API!
The SIMD stuff is extremely complex. The focus for now seems to be to _not_ work on a nice API, and tackle the problems with just getting the absolute basics done in an acceptable way. Higher level stuff will come later.
So in my previous life as a physics student I did talk to many physicists about why Fortran was popular (and later when I discovered Rust, discussed Rust with them).
The main blocker is just that Fortran has a wonderful ecosystem of scientific computing packages, and you can find almost any operation you need out there. This is not true of even Matlab or Mathematica. I have often had to reimplement random algorithms in Mathematica. Fortunately, Mathematica gives some pretty good high-level tools, so it's not that bad, but it's still annoying.
Multidimensional arrays were brought up as an issue with C/C++. AIUI, using macros in C++ makes it tolerable, but not great. I did ask about `arr[(1, 2, 3)]` and folks seemed okay with it. Many have used Mathematica, where indexing is done with double brackets (`arr[[1,2,3]]`), so it isn't too bad.
Implicit bounds checking was something that was frowned upon. It could even be a dealbreaker. While bounds checks can usually be elided by LLVM, scientific computing does tons of indexing and the overhead might crop up in the profile. I'm not actually sure how well this maps to the real world, and I never got a chance to profile it, but it's a concern.
Libraries like ndarray could turn it off but it would basically be unsafe :|
-------
I did use Rust for some scientific computing. At the time, IIRC numeric types had to be more explicit, which was annoying, but that was pretty much my only issue with it (I didn't need any existing scientific libraries). I didn't know of any good visualization libraries so I would usually output to json/csv and visualize in Mathematica. But I have done the same for Fortran code; Mathematica is just awesome at visualization and analysis.
I recall having access to a scientific computing cluster where the only up to date thing was ifort. It had an ancient python, and icc (IIRC it was ancient too, but I'm not sure). Had to compile a bunch of compilers myself to get my own code to run. Asked about this and apparently everyone just used fortran. Python is common in post-processing but a lot of the tools they were using were ancient or maintained by people with similarly ancient toolchains so stuff still worked on it.
> I tried to talk the Go and Rust people into putting in multidimensional arrays, but the discussion degenerated into bikeshedding, with arguments for fancy but slower representations so you could take arbitrary N-dimensional slices from a multidimensional array.
This is the most frustrating aspect of Rust, to me, as a developer of scientific software. Rust has great potential to challenge Fortran in this domain, but numerics is mostly an afterthought in Rust.
> Rust has great potential to challenge Fortran in this domain, but numerics is mostly an afterthought in Rust.
A common problem with Rust: many people from different backgrounds see it as a godsend (for some good reasons) and would like to replace their former language with it, but the language can't evolve fast enough to satisfy everybody at the same time.
It started as a language for desktop software (like a web browser, its first use case), and currently the focus is on evolving it to be suitable for back-end development (with asynchronous IO). I hope one day scientific software will become a first-class citizen too.
While this is true in some sense, there's also a difference between the language changing to support a use-case, and the library ecosystem. The language doesn't _need_ to change for the async io stuff, though there is one feature that will be extremely helpful, "impl Trait". That feature is also needed for other reasons too, so it's not solely motivated by the async io stuff.
It is totally true that there are many areas that can be improved, and it's gonna take time and effort to knock them out. It's tough! My personal pet area is OS dev...
Can you go into a bit about what specifically is missing?
A lot of the people who work on the compiler/library support for this stuff aren't the primary consumers of it. I personally like working on this kind of code when I have time (I've done a bit of work on the rust bignum library), but it helps to know what's actually going to be useful to people.
Basically, OPs article gives a good overview on what scientists want out of a language used for large scale numeric simulations.
1.) Multi-dim arrays supporting array math, slicing and flexible indexing, sizeof always known (internally solved through lookup tables).
2.) Fast-by-default rather than safe-by-default. A safe language is useless if it gives 2x or more runtime overhead. Make it safe where it doesn't give you any performance penalty (i.e. compile-time checks), otherwise make the default performant or leave the choice to the programmer. Example: Pass-by-reference as default, no GC.
Essentially, I don't think Fortran will be replaced any time soon - maybe strong AI that can program all of this stuff from a pure math specification will be here sooner than a Fortran replacement. To everyone not understanding this, read the article again. There's been a huge amount of effort going into this language over the last 20 or so years - almost more than is justified by the size of the HPC market, as is visible from all the compilers lagging behind the specs. If you want to help, go and improve gfortran (e.g. with better GPU support, so people don't need the lagging PGI or locked-down Cray compilers anymore).
There's a lot of activity for D in this area. For instance, there's a BLAS replacement[1] with good performance. Supporting numerical programming is a priority for Walter and Andrei. There's more work to do, but it is adding enough functionality that it will be a realistic option for those that like the language.
If you want to get started, here's my recommendation: Program Kernels only in Modern Fortran, learn about f2py bindings and do the other parts in NumPy. If you use Fortran purely as a number crunching language - no I/O, no setup, no config code etc. - it's pretty sweet actually - IMO there isn't much need for replacing it with something new that has 50 years of compiler work to catch up (which is near impossible anyways).
Not the OP, but a computational physicist writing a fairly decent sized codebase for a long term project, just went through the language choice and initial prototyping phase. I have extensively used Julia, Python, MATLAB, C99, C++98/11/14, Fortran and D for work and I believe in picking the right tool for the job.
I chose C++ (writing in a fairly modern style, but not eschewing aliasing-restricted pointers, etc. where beneficial) because it handily beats everything except Fortran for performance and is vastly superior to Fortran in terms of zero-cost abstraction. I'm writing code that's largely portable between CPUs and accelerators (mostly GPUs) and has a lot of interchangeable parts. With C++ I can use templates to achieve compile time polymorphism and genericity, while retaining Fortran level performance on CPUs at the cost of higher code complexity.
The Fortran story on GPUs has also been very disappointing initially, especially for those of us not interested in proprietary compilers, though I understand the situation is improving there. Ultimately, in my experience, while individual strengths of C++ can be reproduced in other languages, there's no other language that does it all. I hope that changes some day (for now, D looks like the most promising alternative to me) as writing C++ is often an exercise in frustration, but that's where we're at.
I'm in a similar situation as you, albeit with a large Fortran codebase that needs to stay - therefore I chose to write a transpiler in python that enables a Fortran codebase to be parallelised on CPU and GPU - see my profile link.
Hybrid-Fortran looks like an elegant workaround for an ugly problem, kudos for dealing with the storage order as well.
Ultimately, it seems to me the only way to justify using Fortran for scientific computing nowadays (apart from legacy issues, of course) is if you have no intention of ever using accelerators and absolutely must have optimal CPU performance. NVidia and, to a lesser extent, Intel/AMD could have maintained the language advantages Fortran offers in the accelerator space as well, but chose not to, and now you need solutions like this coupled with proprietary compilers just to approach CUDA C / OpenCL performance. Legacy projects will be around for decades, of course, but for new things I think C++ is/will increasingly be the default choice.
Julia has strong numerical and multi-dimensional array support, but it's JIT compiled. However, once a function is JITted, it has comparable performance to Go & Rust.
Yep, Julia is a very nice language with very good performance. But I abandoned it because it is not (yet?) oriented toward library development, being more usable in an exploratory style.
I ran into issues a year ago when trying to organize a library with more than 8000 loc, when the "dump everything into a global namespace" approach started to be more of a hassle and less a nice way to get stuff done. I am waiting for it to stabilize a bit before giving it a second try.
Could you elaborate on why you felt the need to "dump everything into the global namespace"? For quite a long time now, you can use modules in Julia to keep things in separate namespaces. With "import MyModule" (instead of "using MyModule") things will not enter the global namespace, but will need to be qualified, like "MyModule.myfunction(...)". I feel this is very much like Python. You could even introduce abbreviations for the modules, using "const mm = MyModule" (admittedly, Python's "import...as" is cleaner).
Yes, but `using MyModule` is (was?) the default and preferred way to use a module. Which is equivalent to recommending that everyone do `from module import *` in Python.
Not really. They're just written in a way such that the memory layout is compatible with BLAS libraries — which are frequently written in Fortran. So they can be passed directly to functions that happen to have been written in Fortran.
Additionally, invoking BLAS/LAPACK only happens for dense arrays of specific types (single/double precision real/complex numbers). The great thing in Julia is that almost all of the array and linear algebra functionality is implemented completely generically, so that algorithms can be written using these higher-level concepts, and the low-level functionality will (very efficiently) dispatch to BLAS/LAPACK if possible; if not, the generic implementations still allow your algorithm to work for e.g. your user-defined number type.
Haha, yeah, I guess I should have been more precise with my wording. I'm currently prototyping a custom number system in Julia, and obviously the multidimensional array ops will not be backed by BLAS.
I haven't played with Ada arrays much, but I know it supports various different representations (so it can be binary compatible with Fortran column-major arrays and C row-major arrays), it's got nice array slicing semantics, and it's got good support for generics which can operate on arbitrarily sized arrays.
I don't know whether it has Fortran-like aliasing guarantees, though. You can't take an address of an array element unless the array has been declared to support it, which helps, but I don't know whether you can e.g. pass the same array by mutable reference into a function twice. (It makes sense that this isn't supported, as it's kinda meaningless, but I can't find an actual reference.)
Also check out ParaSail language designed for easy, safe parallel and concurrent programming. It's an Ada variant from Tucker Taft at AdaCore. Might have something interesting on this topic. Might not.
The Julia programming language which is gaining a lot of popularity in scientific computation has in fact multidimensional arrays which are very efficient. The whole idea of Julia is to be comparable to C and Fortran in speed while having the ease of development of Python.
I'd dabbled in Go and Rust as well but don't see them as serious competition to Julia in this space. As much as I love Go, it is really poorly suited for numerical computing. No operator overloading of any sort among other things. You really want that for vector based math.
Also, Rust seems like more of a language for professional software developers. Rust looks neat, but I think it is too complicated for scientists. They don't want to focus that much on their language.
In this respect I'd actually think Swift has more appeal as it is quite fast, but easier to use. By easy, I mean there are fewer concepts to learn.
But Swift will have the same problems as Go and Rust, in that it is not being designed with the needs of people doing scientific computing in mind.
That is what is nice with Julia. Out of the box it does so many of the things you want. Right out of the box, without explicitly importing anything, you can start doing advanced matrix operations and load large CSV files, and it groks imaginary numbers. It even has a number system which distinguishes between rational and irrational numbers, for example.
What facility is Rust lacking that prevents you from coding this as an extension/package rather than needing core coding support?
That sort of thinking is what drives number-crunching people away from a language. You end up with ten different representations for matrices, and can't pass matrices from one package to another.
If the numerics people can't agree on something, throwing the decision over to the core Rust folks is unlikely to result in something better.
So, the question still stands: What in Rust prevents an ambitious number cruncher with good design taste from adding the appropriate pieces to the language?
Fortran has 1st class many-dimensional arrays; can you really do that efficiently (as in "works great with a loop nest optimizer/LNO" and "has pretty syntax" and "there's one way to do it") as a Rust extension? C++ says "NOOOOOOO!!!" I don't know Rust well enough to answer the question.
You can do `arr[(1,2,3)]` in Rust (not pretty, but ok). Assuming you've marked the indexers as inline, LLVM should be able to optimize it pretty well. If it doesn't, optimizations can be added.
Rust can't enforce "one way to do it" since this would be done as a library. Currently there's only one library offering this (ndarray), so in a sense there's only one way to do it :)
What does arr[(1,2,3)] have to do with what I asked for? The code that I write for my personal scientific stuff (numerical hydrodynamics) needs loop jamming/fusion and splitting, SIMD without having to use any special notation, and other stuff which is typically provided by compiler LNO. I happen to have interchanged loops in my code, but if I goofed that up, it'd be nice for the compiler to do it for me.
No array slices needed. Plain Fortran 77 notation provides what I need.
Thanks for trying to be helpful. The place where C and C++ go wrong is multi-dimensional arrays. Some googling tells me that ndarray gives Rust syntax like foo[[1,2,3]] for a 3-dimensional array. I have no guess whether this optimizes well or not.
It will compile down to `foo[1][2][3]` and get optimized the same way. Rust additionally does hint a lot more about aliasability to LLVM (everything is restrict by default, basically), so there probably is more scope for optimization here.
This is already true in the Fortran world. When I was a student I was working with some condensed matter physics software and there was all kinds of madness with flipping conventions in there.
And Rust doesn't have ten different ways to do n-dimensional arrays. There's one major library out there (ndarray). I just learned about a second (https://github.com/Luthaf/mudi) which seems to use the same conventions.
There are multiple ways to do matrices, but these are in graphics-oriented libraries.
Well, it can work if one package gets dominant - e.g. the Python ecosystem works quite well with having all this outside of core language in the numpy package.
Nothing prevents a person from doing that, but that isn't what folks doing scientific computing are interested in, generally speaking. Some are, sure, but in aggregate as a matter of what the "day to day" is, it's just not of much interest.
I mean, maybe I was an outlier, but when I was in academia I wanted to have my models compute as accurately and quickly as possible, and I generally didn't have much time for or interest in the computer coding beyond the necessary implementation details required to conduct my research.
Telling me "implement an extension", then, would have made me roll my eyes and go back to Fortran.
I don't think the idea is necessarily that you personally implement an extension, but that "someone" needs to implement it and the bar for "someone" to do so is lower if it can be done via extensions or libraries rather than having to be done in the core. That someone should share their work, of course, and that someone may find they have a feature or two they have to get into core, but even that's easier than sticking a whole whackload of changes into core.
It would be nice to have access to an overloadable index operator that works on multiple dimensions at once (like C#'s [1, 2, 5] syntax). It's sometimes not easy to write something to be efficient if it has to go one-by-one like [1][2][5].
One of the nice things about D is that it offers multidimensional indexing. For instance, you can write x[1, 2..6], and you can do it for user defined structs. In my case it makes it really nice for interfacing with R.
Fortran is also better at complex numbers. In Fortran these are a native type with every bit as much support as for real floating point types. Not so in the top competing languages (C/C++).
Last time I looked at C, the libraries had yet to be ported to C99. You had to choose between z = u + 2 * i * w and zs = fft(ws); you couldn't have both. Therefore Fortran kept winning.
It is kind of a "too little, too late" problem. C added complex numbers a decade after C++ spun off, so the two languages have different approaches. And C99 complex number support was slow to be added to compilers. On the UNIX side, it was never added to Solaris 9 (EOL 2014); Microsoft finally added Visual Studio support in 2013. And even if your compiler supported it, code generation quality has been historically bad. Some things were never added to the standard, like printf support, so complex double is still a lesser native type than double.
In C, I've seen people do "multidimensional arrays" by A[n * i + j], with A a double[m * n], rather than by arrays of arrays. Especially if the array is being passed as a parameter, and the array's dimensions are given by other arguments, and the array is to be used with BLAS.
Is the compiler smart enough to do strength reduction and get rid of the multiply when indexing? FORTRAN compilers have been doing that for multidimensional arrays since 1954.
Guy Steele and his group had a really interesting thing going with the language Fortress (the name is a nod to Fortran, of course, but fortified). It was meant for high productivity as well as high speed (petaflops range). It got shelved after Oracle bought Sun. It was one of the three languages seeded by DARPA funding for high-productivity computing.
It is simply due to apprenticeship. Physicists do research, and their code is research code. Like most other research, the research code is maintained and used in-house. Like most in-house, non-shared code (in any language), it is impenetrable unless you go through an apprenticeship. The masters are typically old, FORTRAN is their language, and it is the only choice for the apprentices. Once the apprentices graduate and become masters themselves, they further develop their own in-house code, and it is inevitably in FORTRAN too. As with any over-developed code, it is too expensive to migrate even when the new master recognizes the shortcomings. Bottom line: FORTRAN is not bad once you are familiar with (and invested in) it, so FORTRAN persists over another generation of apprentices...
However, there is a slight change. The new generation of masters often do know other languages, so they sometimes tolerate their new apprentices using another language and are still able to teach them. So at this point, far from all physicists use FORTRAN.
FORTRAN will die out, just takes time.
One of the sticking points is the BLAS/LAPACK library. It is in FORTRAN. Or more precisely, its interface is in FORTRAN. I don't foresee this changing for a long time. So FORTRAN will be like COBOL: it will persist even when no one really uses it anymore.
I think that's only partly true. Physicists would probably not be exposed to Fortran if it weren't for "apprenticeship". But the same would be true for any other very domain-specific language (say, Mathematica). However, that doesn't change the fact that Fortran (in its modern syntax) is a great language for numerics. Its abstractions are extremely well suited for any computation focusing on multidimensional arrays. It is certainly a more expressive language for numerics than C/C++ (and it's not as easy to shoot yourself in the foot in Fortran). There are not many other languages that can compete for high-performance applications. Julia looks like it can do some interesting things. Python is great as a glue language (but not for number crunching, and of course it's slow). I would still (and regularly do) choose Fortran for new projects. It's not going away anytime soon.
First, if someone is familiar with or invested in a language, they will be biased toward choosing that language, and it will make sense for them. Before you argue: I admit I can't prove that you are biased, but neither can you prove that you are not. So let's agree that anecdotes like yours are not a strong argument.
So let's look at it from another angle. If physicists choose FORTRAN for reasons beyond apprenticeship, there must be people who choose FORTRAN even when their masters do not introduce them to it or require it. In other words, we need examples of people who chose FORTRAN despite being unfamiliar with it. There are plenty of such cases for, e.g., Rust. I know of none for FORTRAN. How about you? For example, how successful have you been at influencing other people (other than your students) to use FORTRAN? On the other hand, I do know people who chose another language despite being familiar with FORTRAN.
Of course knowing a programming language strongly biases someone to choose that language for a project. I also agree you will not find many people (if any) using Fortran who have not been introduced to the language as part of their research, following the recommendation of someone senior (e.g. their advisor). However, your original argument was that Fortran is only used because advisors force their students to use it, and that Fortran will eventually die out along with those dinosaurs. I don't think this is true, and I don't think the fact that people generally don't just stumble into Fortran by themselves is any evidence to the contrary. As the original article points out, the reason Fortran continues to be used is that it is the most suitable language for its domain (though of course not as a general purpose language).
The prerequisite for using Fortran is that someone works in a field that relies heavily on large-scale, high-performance numerical computation. Since Fortran is pervasive in that domain, it's extremely likely that they would be exposed to Fortran through "apprenticeship". Basically, most people who should be using Fortran are using Fortran! I've seen a few instances where people used C/C++ or Matlab in situations where Fortran would have been more appropriate. Matlab has its own drawbacks, but it is also a very domain-specific numerical language, and likewise its users are exposed to it through apprenticeship! C/C++ is still the default language for high performance computing outside of numerics, and thus it's not surprising that people who have not been exposed to Fortran would attempt to use it for numerical purposes as well.
There isn't exactly a lot of choice for numerical programming: Python and Matlab are (partly) designed with numerical purposes in mind, but don't scale to high performance. C/C++ are performant, but not expressive for numerics. There really isn't any other language that would even be in consideration, besides now Julia, but Julia is extremely new, so time will tell.
I don't know anything about Rust, but based on reading its Wikipedia article, it's not a language designed for numerics but rather something designed to replace C++ for systems programming. Thus, like C/C++, it will be performant, but not expressive for numerical purposes.
It is not animosity -- I love the simplicity of FORTRAN -- I am just stating my speculation.
C replaces FORTRAN -- that is, on merit. In reality, computational physicists mostly go to C++ plus Python. Truth be told, physicists are not computer scientists, and they are not even interested in becoming them. They simply pick a language out of necessity/convenience -- which is why FORTRAN was picked back then.
PS: That FORTRAN is fast is overstated, IMHO. I am familiar with high performance computing. In terms of parallel computing, it all boils down to algorithm design. In terms of vector/linear algebra, it all depends on machine-tuned BLAS (which is mostly due to hand crafting rather than the language itself). There is no evidence that FORTRAN code necessarily runs faster than C, comparing first-draft, non-optimized code. FORTRAN is less capable of, and more cumbersome for, doing trendy stuff than C -- and that is the actual merit of FORTRAN over C.
From my point of view, it's understated, and should be given much more credit than it gets. Intel has done a lot of work to make their ifort compiler leverage their latest instruction sets like AVX2, resulting in even bigger speedups, and PGI has done absolutely amazing work writing a compiler which can ingest unmodified Fortran code and compile it to run on NVidia GPUs. It's nothing short of mind-blowing to me and gets way too little mention. Modern Fortran is a really nice language, and all this bashing of it because it's not trendy is completely unfounded. Just ask Intel.
I believe one of Fortran's biggest problems is that if you want all the advantages modern Fortran has to offer, you have to use closed-source commercial tools. If the open source tools were closer to what PGI and Intel offer in terms of features and performance, I don't doubt that Fortran would be more popular.
> Interestingly, C/C++ beats Fortran on all but two of the benchmarks, although they are fairly close on most.
Unless I missed something, this directly contradicts the quoted benchmark source [1], where C beat Fortran on all tests except one (where they were tied), and usually by a wide margin. Plus, I believe gcc usually produces slower code than Intel's compilers (at least for C), so it's not clear that C-gcc vs. Fortran-Intel is a fair comparison.
But beyond that, I'm always confused as to why there is a significant difference. I would expect two compiled languages with reasonable compilers to perform almost exactly the same when they're doing nothing but arithmetic operations, since those are almost entirely determined by the CPU, not the cleverness or speed of the standard library algorithms. Can anyone enlighten me?
There are some important things to note about these benchmarks. On the Mandelbrot set, I noticed that the program does not use arrays despite the obvious 2D-array structure of the bitmap. Two of the four cores were also underutilized. Next, the k-nucleotide, reverse-complement, and fasta programs only utilized a single core, compared to the C programs utilizing multiple cores. It is not clear to me why this should be the case, but it leads me to cast doubt on the tests I listed.
I have cut back: x64 single core & x86 quad core are no longer measured, x86 single core updates are not frequent, and x64 quad core updates are less frequent than they used to be (although more frequent than the chart dates might suggest).
I've kind-of been hoping the hardware would fail -- but someone would probably gift me a new machine :-)
From my perspective "defence of the benchmarks game" is a matter of saying "Keep it in perspective" and "Keep it in context" until we do ;-)
Suppose a function takes two pointer arguments, a and b. Then maybe b is equal to a + 3, so your compiled code has to assume that whenever it writes to an element of a, it might have changed an element of b, and so reload b from memory. Alternatively, it can do a bunch of tests at the start of the function, and compile multiple copies of the function for different situations.
In FORTRAN, this isn't possible (you don't tend to pass around raw pointers), so it is much easier for the compiler to reason about what objects might be in the same place, and which aren't.
The restrict keyword goes some way toward fixing this in C, but it is tricky to use correctly, and it doesn't solve all problems.
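To make that concrete, here is a toy C sketch (the function names are mine, not from any real code):

```c
#include <stddef.h>

/* Without restrict, the compiler must assume a and b may overlap
   (e.g. b == a + 3), so a store to a[i] can invalidate any b[j]
   it has already loaded into a register. */
void scale_mayalias(float *a, const float *b, size_t n) {
    for (size_t i = 0; i < n; i++)
        a[i] = 2.0f * b[i];
}

/* With restrict, the programmer promises the arrays don't overlap,
   so the loop can be vectorized without runtime overlap checks --
   essentially the guarantee Fortran gives for every argument. */
void scale_noalias(float *restrict a, const float *restrict b, size_t n) {
    for (size_t i = 0; i < n; i++)
        a[i] = 2.0f * b[i];
}
```

Compile both with -O3 and inspect the assembly under gcc or clang; the restrict version typically drops the overlap checks and version dispatch.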
Your Fortran summary is incorrect. F77-style arrays are generally passed as a raw pointer (under the hood.) F90-style allocatable arrays are passed as an array descriptor containing size information, again under the hood.
This doesn't affect aliasing in Fortran at all. Aliasing is simply prohibited, which leaves the compiler free to generate code that gets the wrong answer, if the programmer screws up and has actual aliasing at run-time.
Apologies, you are right. It's so long since I did Fortran that I had misremembered the language as making aliasing impossible, as opposed to it being undefined behaviour.
mypair* restrict position = &object->pair;
float* restrict position_x = &position->x;
Is undefined behaviour, as you can't take a restricted pointer to a child of a restricted pointer in the same scope (the rule isn't quite that simple, but that's part of it). In general, rules about pointers to pointers get complicated, and I've had trouble with them in the past.
In general, it's a good idea to try to only use restrict for "leaf" pointers, not intermediate objects, but that does block some optimisations.
There is also the obvious problem of people getting carried away, and marking something 'restrict' where they shouldn't have.
While 'restrict' is very easy to use on simple structures (like just int*), if you look around you'll find very little, if any, material about using it for complicated nested data structures.
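For contrast, a pattern that stays within the rules (my own sketch, reusing the names from the snippet above): keep the intermediate pointer unqualified and apply restrict only at the leaves.

```c
typedef struct { float x, y; } mypair;
typedef struct { mypair pair; } object_t;

/* The intermediate pointer makes no restrict promise; the two leaf
   pointers each refer to a distinct float, so qualifying them is
   safe and still tells the compiler the loads can't interfere. */
float leaf_sum(object_t *object) {
    mypair *position = &object->pair;   /* plain pointer */
    float *restrict px = &position->x;  /* restrict at the leaves */
    float *restrict py = &position->y;
    return *px + *py;
}
```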
There's also the fact that restrict is only a promise to the compiler "this function will not be called with aliased arguments", not also a command "do not allow people to call this function with aliased arguments".
As far as I know, Fortran, at least in its original form, enforced the latter. (That was fairly easy to do, as it had no dynamic memory allocation, pointers, or recursion; the compiler determined the memory location of every variable.)
Neither of your options is the actual rule. The actual rule is that when the program runs, there can't be any runtime aliasing.
You can totally call a subroutine with several arguments being the same thing, as long as you don't do anything bad inside the subroutine.
I'm aware of this because I tried to get the PathScale compiler team to add a warning for incorrect aliasing. That wasn't possible. But I did get a flag which made the compiler obey C aliasing rules for Fortran, and it was part of the "got the wrong answer? run with these flags and if you get the correct answer, your program is not standard-conforming" thingie.
Please educate me and explain how that contradicts anything I claimed, as I don't see it.
I claim it is only a promise, you claim you can call it with identical arguments. I don't see a contradiction; making a promise isn't the same as keeping it.
Similarly, your argument that it is fine to break that promise as long as you don't do anything bad inside the subroutine, I read as "you sometimes can get away with breaking a promise", but again, that's not the same as not making the promise.
"An object that is accessed through a restrict-qualified pointer has a special association with that pointer. This association, defined in 6.7.3.1 below, requires that all accesses to that object use, directly or indirectly, the value of that particular pointer."
So, it hinges on the meaning of 'accesses'; it is a runtime thing, not a compile-time thing.
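A toy C illustration of that runtime reading (my own example, not from the standard):

```c
#include <stddef.h>

/* Nothing is written through either pointer, so even the aliased
   call sum2(v, v, n) involves no access the restrict rule forbids --
   it is well-defined despite the restrict qualifiers. */
float sum2(const float *restrict a, const float *restrict b, size_t n) {
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += a[i] + b[i];
    return s;
}

/* Here the elements of a are modified, so axpy(v, v, n) would be
   undefined behaviour: the same objects would be accessed through
   two different restrict pointers while being modified. */
void axpy(float *restrict a, const float *restrict b, size_t n) {
    for (size_t i = 0; i < n; i++)
        a[i] += b[i];
}
```

Same signatures, same promise at compile time; only what happens when the program runs decides whether the promise was kept.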
> I would expect two compiled languages [...] to perform almost exactly the same when they're doing nothing but arithmetic operations [...]
Different languages have different memory models, which can result in vastly different performance -- e.g. a GC vs. reference counting vs. linear types. If you are doing arithmetic on something that needs to be managed (e.g. an array), you'll see differences.
Some language features close the door for optimizations. E.g. the ability to dynamically add or remove methods to an object/class can make method dispatch more or less expensive.
Finally, some languages have undefined behavior (e.g. what happens when there's an overflow or a division by zero) while other languages want specific behavior for the edge cases. This all affects the performance of the resulting code, even for basic arithmetic operations.
In this case though, C doesn't have garbage collection or OO features, and (I believe) has unspecified behavior for arithmetic exceptions. I assume Fortran does the same (since it's comparably fast), so I guess I still don't really see where people expect the difference to come from.
C and Fortran both obey IEEE 754 and have well-defined arithmetic exceptions on most platforms -- but programmers usually instruct the compilers to ignore all that in the interests of faster code.
As for what the differences are, try talking to a compiler engineer. The better ones love how Fortran features like the argument aliasing rule for 100% of code makes generating fast code easier. In fact that's one of my tests for compiler engineers in interviews; if they've worked on a C/C++/Fortran compiler and can't say anything good about the challenges and opportunities of Fortran optimization relative to C/C++, then they haven't really drilled down on performance issues!
Fortran hasn't been all caps for nearly 30 years now...
It is very easy for a physical scientist to write Fortran that performs well - the syntax looks very similar to the maths that is being modeled. It is a lot harder for a scientist to write tight C code.
Historically, C compilers just couldn't do such a good job at optimising, because C is pretty complex and has loads of undefined behaviour. Also, Fortran compilers have always been optimised for numerical computation.
Recently*, a lot of effort has gone into making C compilers produce very optimised code. In general, this is a good idea, because humans are bad at optimising and also at remembering every damned Intel vectorisation instruction.
So historically, Fortran was much faster; recently*, with compiler advances in e.g. reordering and AMD64, it's possible for C compilers to beat Fortran compilers.
* edit: adjust your definition of recently to encompass the entire timeframe of Fortran's existence
C can be faster than Fortran if you optimize it to be. Fortran is built to run very close to the maximum C speed for these things out of the box (as I understand it after doing some extensive googling a while back).
> I would expect two compiled languages with reasonable compilers to perform almost exactly the same when they're doing nothing but arithmetic operations
Auto-vectorization? These implementations vary wildly between GCC and Intel.
When I was doing my graduate studies in astrodynamics (spacecraft trajectory design) at Purdue University circa 2003, we worked closely with the Jet Propulsion Lab (JPL) and got to use a lot of their excellent astrodynamics code (used for missions like Galileo and Cassini). It was all Fortran.
I heard that there was an effort at JPL to convert over to using Python. I'm not sure how that went. Maybe they were just writing Python bindings for their Fortran code? I know that much of SciPy consists of Python bindings for Fortran and C code [1].
The software I used to do "nonlinear programming" (i.e. parameter optimization), called NPOPT, was also written in Fortran.
The Python replacement succeeded - I slightly know the people involved, who got an award for the work. But I don't use the library so I don't know if the wrapper approach was used. That would be perfectly legit IMHO.
You didn't read the article apparently. Fortran (which officially dropped all caps in the standard in 1990) has had new versions every couple years, up to 2013. Each revision of the language brings modern features into it, and old features go through a process of deprecation and then obsolescence. Most people seem to be unaware that it is an evolving language that has changed significantly since the standard most people are familiar with from 1977. I am a bit biased though: I work on a Fortran compiler, so I've had reason to stay on top of the standards as they come out.
In the 70s, when Fortran 66 still ruled the night, the usual suspects at Bell Labs wrote Ratfor, which was pretty much what Coffeescript did to Javascript, bending the syntax towards C.
When you compare the current Fortran standards to its '66 version, you'll see how much progress it made.
And I hope that you can actually use it these days. Round the turn of the millennium, it was still quite common for people to use Fortran 77 because that's all that was supported or fit in the general codebase.
The author is I think being a bit unfair to C++ when it comes to matrix/array handling. Their example of things that would be difficult to do in C/C++:
A = B
A = 3.24 * B
C = A * B
B = exp(A)
norm = sqrt(sum(A**2))
Quoting:
"Similar C/C++ code simply does not exist ... Having to use libraries instead of intrinsic functions means the resulting code is never as neat, as transferable, or as easy to learn."
While this statement is true, I think strongly undersells how well these kinds of libraries can be "baked" into C++ through operator overloading, templating, etc. See for example, the Eigen library [http://eigen.tuxfamily.org/].
Not to mention all of that array behavior is available in the standard C++ library with std::valarray<T>, e.g., https://godbolt.org/g/JBSvuH. There was talk of a more thorough standard multi-dimensional array type, but I have no idea what the status of that is.
std::valarray does not have multidimensional support at all. In fact, std::valarray is a relatively abandoned part of the standard library. You'll have to go with one of the 3rd party libraries for better support.
My point was not so much that valarray is awesome, but that articles like this one tend to fight a very strawman version of C++, even when just sticking to strictly standard components.
If anything, saying that C++ has no support for linear algebra is giving it too much credit. C++ has an abundance of libraries that are broken in various strange and marvelous ways. Typically, you find out about the brokenness only after you made the investment of learning the library and put it into production, so that there is no way back. Your comment about valarray is a good example of this.
That's a criticism I'm entirely on board with, and is not limited to linear algebra libraries. HTTP client libraries are worse, there's not a single good one.
Here's the problem: these kinds of libraries will never come close to touching Fortran's built-in features, simply because the compiler can't automatically optimise based on knowledge about the code that only the library has. Even arrays not knowing their own size is still an issue, let alone array-based ops like those shown in the article.
Fortran is not the fastest language around the block, but if you don't want to go into non-portable machine-based hand-tuned optimisations (e.g. loop unrolling) it's hard to beat. Basically, if you optimise for both runtime and programmer time for HPC applications I'd argue that Fortran is top of the crop.
Well, in the case of Eigen it's an entirely header based library, so the compiler can optimize some things. Not to mention the template based meta-programming strategies [1] Eigen itself uses to simplify and optimize during compilation (including vectorization and loop unrolling). The fact that this kind of infrastructure can be built up in a library in C++, i.e. it doesn't have to be baked into the compiler, is a huge plus I think relative to Fortran.
> If a function that takes a ‘real’ is fed a ‘const real’, it will return an error. It is easy to imagine how this can lead to problems with interoperability between codes.
The amount of existing scientific code in Fortran is huge, and there's a chicken-and-egg situation (new code uses/extends old code, so new code will also be in Fortran) that will only increase it further!
It's funny that someone has to write an article to defend modern FORTRAN to people who write in JavaScript, which is the best proof we programmers have of entropy.
Yes but if you go into a discussion of Star Trek movies complaining loudly about how bad you think the new Star Wars is, you're completely irrelevant at best.
Yes, and what does that have to do with this article? If you come to a discussion about thing x to complain about thing y, no matter how valid your complaint you are out of order.
To quote The Dude: "You're not wrong, you're just an asshole."
Yes, and what does that have to do with this article?
Excellent question, glad you asked! Just what do you think all the people who are asking why Fortran is still in use are using?
My money is on the bet that they're not using Fortran. I've worked in the finite element analysis / high performance computing fields for quite a few years and I've never heard anyone who used Fortran complain that they were, or had to use it. Could it be that it's because people like that they get high performance from a language out of the box?
Because JavaScript is not very well thought out, by the JavaScript author's own admission. If the father of the language says he just hacked it together, there isn't anything to argue about.
Even if you are using python, R or java you are still best off linking to matrix subroutines in fortran if you want to solve or diagonalize, get eigenvalues and such. This is not just about raw performance but also correctness, ecology of roundoff errors, etc.
For Python, at least, the most popular numeric computation library (NumPy) uses a fair bit of Fortran under the hood. I'm not certain whether the matrix operations do, but it certainly wouldn't surprise me.
The vast majority of BLAS and LAPACK are written in Fortran (at least, the ones provided by netlib.org). Github also reports that OpenBLAS is around 50% Fortran, so there's quite a bit there too [1]. Usually when you call BLAS functions from C you're really calling a cBLAS wrapper to some Fortran function. This is generally why you see -lgfortran -lf77blas when you compile programs that require numerical tools.
At least in OpenBLAS, most (all?) of the BLAS routines are C functions that call kernels written in assembly, but many of the LAPACK routines are the Netlib reference Fortran versions.
Yet, I root for D and its GLAS library [0]. It is a young project, but it beats the usual suspects for single-threaded matrix-multiplication. Stuff like "solve or diagonalize, get eigenvalues and such" is not written yet. It is very promising for a one-man project, though. We'll see in 10 years, I guess.
Fortran certainly has a mythical reputation for performance, but it should be challenged every once in a while.
As a physicist please have faith in us. We don't all use Fortran. I use Ruby as much as possible and it's fun because it forces me to write libraries that mimic ones that are found in other languages. So I end up learning a lot.
When I was in undergrad I did some computer simulation work for my professor. He programmed in FORTRAN while I had been trained in C/C++. I did my simulations in C/C++ and I also wrote some libraries for his FORTRAN code so he could have more dynamic configuration files. At the time I thought that FORTRAN was an anachronism and was surprised that it was still being used in physics.
I've been looking for a Matlab alternative and been disappointed with either the performance or the syntax of Octave, Python, and Julia. I'd known Fortran had modern features and updates every few years, but never tried to pick it up since the excitement seems to be elsewhere. This article is convincing that Fortran is worth trying out.
If you want to get down into the nuts and bolts of numerical computation, with convenient concurrency, you should definitely look into Fortran + OpenMP.
I know a person who works in atmospheric physics group. All the heavy physics simulation and modeling happens in Fortran, and then they do the "lighter" analysis, summary statistics, plots, visualizations, etc with a Python-based stack.
If I understand correctly, your solution, as it is, would still be wrong to pass to the Fortran code, since in Fortran the column index is the fastest-changing. And if you blindly translated some FORTRAN code to C with the above arrays, even if you got the same results, the C version would be slower because of that, as the FORTRAN code could have been written with knowledge of which index arrangement is faster.
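To make the layout point concrete, here's a hedged C sketch (sizes and function names are mine): C is row-major, so the rightmost index should vary fastest in the inner loop; Fortran is column-major, the opposite.

```c
#define M 256
#define N 256

/* Row-major traversal: a[i][j] and a[i][j+1] are adjacent in memory,
   so the inner j loop walks sequentially through cache lines. */
double sum_row_order(double a[M][N]) {
    double s = 0.0;
    for (int i = 0; i < M; i++)
        for (int j = 0; j < N; j++)
            s += a[i][j];
    return s;
}

/* Column-major traversal -- the loop order a naive FORTRAN-to-C port
   would inherit. Same result, but each access strides N doubles,
   which is typically slower on large arrays. */
double sum_col_order(double a[M][N]) {
    double s = 0.0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < M; i++)
            s += a[i][j];
    return s;
}
```

Both return the same answer; only the memory access pattern, and hence the cache behaviour, differs.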
Practically you are right, but the article is more right than you. I got into some fights with quite a few elite teachers/coders who teach doing 2D arrays exactly as described, and they wouldn't agree that the method you describe is less error prone... because: it is tradition. (And as every person doing research knows, tradition is not supposed to change -- that's what science values.)
It seems absurd, but so many things are absurd these days I am not even surprised.
Oh, the method I describe is more error prone than array of arrays, because if you ever modify the two variables you used to specify the array size, the compiler will just use the new values for array dimensions. If I had to do array of arrays though, I'd at least do something like
template<typename T>
T** make_2d_array(size_t m, size_t n) {
    // TODO: add padding for alignment,
    // but you're unlikely to need something aligned
    // on a larger extent than sizeof(void*)
    void* memory = malloc(m * sizeof(void*) + m * n * sizeof(T));
    T** array = (T**)memory;
    T* data = (T*)((char*)memory + m * sizeof(void*));
    for (size_t i = 0; i < m; i++) {
        array[i] = data + i * n;
    }
    return array;
}
And you can then delete the entire array with a single free().
Except that as of C11 variable length arrays (VLAs) are optional. So with a fully standard compliant C11 compiler your code could potentially fail. While the old way of using arrays of arrays will work with any C compiler.
(Not that important if you use a pure C compiler, but your code will also fail if you use a C++ compiler.)
My reason for using fortran (when I was studying physics) was that when going through the university library looking for a computational physics textbook, I couldn't find one that wasn't in fortran—actually I found one (computational physics, Giordano & Nakanishi) but it wasn't enough to get by. Fortunately I already knew fortran, because all the legacy textbooks+journals were written in the language.
Some of the information here is a bit wrong. The idea that const is weaker than parameter is true, but not for the reasons given: any function taking a float, int, etc. in C++ can also take a const float or const int.
It's also unfair to suggest that C/C++ is at fault for not allowing a = b * c with array types. This is a failing of the standard library and easily fixed... operator overloading lets you create exactly this functionality, and std::valarray already does this for a lot of what you get in Fortran...
Rolling your own is pretty easy; it is a shame that the C++ standard library is a dire wasteland of bad implementations. You are right that valarray will do heap allocations, and making it work on the stack is a real ballache.
It seems like a number of R packages use Fortran. I'm on Mac, and a Fortran compiler needs to be installed so that at least some of the R packages can be compiled when doing updates.
Here are some resources I have collected over the years.
"Introduction to Modern Fortran" (Fortran 2003): http://people.ds.cam.ac.uk/nmm1/Fortran/index.html
Fortran Wiki: http://fortranwiki.org/fortran/show/HomePage
List of Fortran tutorials: http://www.fortran.com/the-fortran-company-homepage/fortran-...
Fortran vs. C: http://www-cs-students.stanford.edu/~blynn/c/fortran.html
Fortran's influence on C's "restrict" keyword: http://www.drdobbs.com/the-new-c-it-all-began-with-fortran/1...