I understand the appeal of array languages — they provide very convenient shortcuts to common loop constructs in traditional imperative scalar languages. Their usefulness is manifest in the popularity of NumPy (and its derivatives, e.g. TensorFlow, PyTorch, JAX, etc.), MATLAB, Julia, R, and other Iverson Ghosts [0]. The rise of map/reduce/filter operations in common scalar programming languages is also testament to array languages' usefulness.
I understand (though do not personally agree with) the appeal of extreme terseness — there are arguments to be made that maximizing information density minimizes context switching, since it reduces the amount of scrolling. Personally, I find that large displays and split buffers mitigate this issue, whereas the mental overhead of using my short-term memory to map dozens of single letter variables to semantic definitions is much higher than having to flip between splits/marks in my text editor. (The fact that the aforementioned Iverson Ghost languages are popular whereas APL and its derivatives are not is evidence that I'm aligned with most people in this regard.)
I don't understand why people rarely make the terseness argument for non-array languages, even though it's just as easy to write tersely in them — the Obfuscated C Competition is a prime example [1]. Is it just due to the influence of APL, or is there something special about array languages that gives them a unique proclivity for terseness?
One thing humans are good at is pattern recognition. Terse APL is - once you are used to it - very recognizable.
Many constructs take fewer characters to specify algorithmically than to name. (Examples are in K because that’s what I know best.)
(+/x)%#x computes the average of a vector x; or the average of each column in a 2D matrix x; or other averages for other constructs. It takes about as many characters as spelling “average”, which is considered too short a name in modern C or Java - and yet the code is instantly recognizable to any K programmer, and needs no documentation about how it deals with NaNs or an empty vector or whatever (which your named C/Java/K routine would - does it return 0? NaN? Raise an exception? Segfault?)
And ,// (yes - that’s comma slash slash) flattens a recursive list. Way shorter than its name, and it’s the entire implementation.
Are these the most numerically stable / efficient ways to average or flatten? No. But they are the least-cognitive-load, fastest-to-grasp-and-pattern-match when reading code. Once you are used to them.
The appeal of Iverson languages also comes from a good selection of primitives. Most modern languages such as C++, Python, Nim, even Rust have an implicit focus on “meta” programming: they give you the tools (templates, macros, classes) to build abstractions, with which you later build your actual computation. K / J / APL / BQN expect you to do the computation with far fewer abstractions - but provide primitives that make that incredibly easy.
For example, there is a “grade” primitive which returns a vector that - if used to index your original list - would sort it.
Now, say you have a list of student names, and ages, and you wish to sort them - once alphabetically, once by age. In idiomatic C++/Python etc, you’d have a “student” class with three fields. Then you’d write some comparator functions to pass to your sort routine. (I am aware of accessors and the key arg to Python’s sort; assume for a second they aren’t there.)
In K/APL/J, you’d just have 3 lists whose indices correspond: and then it is just:
name[<age]
Read “name indexed by grading of ages”. There’s a terser version: name@<age, read “name at grade of age”.
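For readers who don't know K, here is a rough Python sketch of the grade idea, using only the standard library. The helper names grade and index are made up for this illustration (they are not K or Python built-ins), and the student data is invented; it only aims to show that "sort one list by another" falls out of indexing with a grade vector.

```python
def grade(xs):
    """Return the indices that would sort xs in ascending order.

    Hypothetical stand-in for the "grade" primitive described above.
    """
    return sorted(range(len(xs)), key=lambda i: xs[i])


def index(xs, idx):
    """Pick the items of xs at the positions listed in idx."""
    return [xs[i] for i in idx]


# Parallel lists whose indices correspond (invented sample data).
name = ["Carol", "Alice", "Bob"]
age = [19, 22, 20]

names_by_age = index(name, grade(age))    # "name indexed by grading of ages"
ages_by_name = index(age, grade(name))    # same idea, sorted alphabetically
```

With this data, names_by_age comes out as Carol, Bob, Alice and ages_by_name as 22, 20, 19. No record type and no comparator function is needed; the grade vector is the only intermediate.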
The terseness compounds. Once you are used to it, every other programming language seems so uselessly bloated.
None of these things apply to obfuscated or shortened C.
Arthur released K source code, which is C written in the same style. It does not have the same appeal.
These are all great examples of the advantages of array programming syntax versus imperative scalar programming syntax, but do not IMO demonstrate the advantage of terse array programming syntax.
For example, the NumPy equivalents of your examples are not materially longer than their APL/J equivalents, but are easily readable even by people unfamiliar with NumPy:
> the average of a vector x; or the average of each column in a 2D matrix x
x.mean(0)
or, to use your example verbatim,
x.sum(0)/x.shape[0]
> flattens a recursive list
x.ravel()
> name indexed by grading of ages
name[age.argsort()]
Though for this application, you’d probably be using a dataframe library like Pandas, in which case this would be
You didn’t address points I already raised about mean() - how does it handle an empty vector? How does it handle NaNs? You have to read the documentation to figure that out. In K, the implementation is in front of your eyes; NaN handling follows from “over” / reduce / addition semantics; and empty vector from reduce and division by zero semantics. It is all consistent by construction, and follows from basic properties. The same cannot be said about mean() or sum().
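To make the "consistent by construction" point concrete, here is a small Python sketch that spells the mean as reduce-with-addition divided by the count. The function name mean_by_construction is invented for this example, and the edge-case results are Python's, not K's (K's answers for the same inputs may well differ); the point is only that they can be read off the primitives rather than looked up.

```python
import math
from functools import reduce
from operator import add


def mean_by_construction(xs):
    """Sum with reduce/add, then divide by the count: "+ over x, divided by count x"."""
    return reduce(add, xs, 0) / len(xs)


ordinary = mean_by_construction([1.0, 2.0, 3.0])

# NaN handling follows from addition: anything plus NaN is NaN.
with_nan = mean_by_construction([1.0, float("nan")])

# The empty case follows from reduce (gives the seed, 0) and division (0 / 0).
try:
    mean_by_construction([])
    empty_result = "no error"
except ZeroDivisionError:
    empty_result = "ZeroDivisionError"
```

Here ordinary is 2.0, with_nan is NaN, and the empty list raises ZeroDivisionError, each of which follows from how add, reduce, and / behave in Python.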
The call to .ravel() is strictly less powerful than ,// which would flatten a matrix, but also a lisp/xml style recursive list structure. And it is the actual implementation, not some weird name! It is “join over until convergence”.
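A rough Python rendering of “join over until convergence”, for anyone who wants to see the mechanics. The names join_over and flatten are made up for this sketch, it only treats Python lists as nested structure, and it makes no claim to match K's behaviour in every corner case.

```python
def join_over(xs):
    """One pass of "join over": concatenate the items of xs.

    A non-list item is treated as a one-item list, so one pass removes
    exactly one level of nesting.
    """
    out = []
    for item in xs:
        out.extend(item if isinstance(item, list) else [item])
    return out


def flatten(xs):
    """Repeat join_over until the result stops changing."""
    while True:
        nxt = join_over(xs)
        if nxt == xs:
            return xs
        xs = nxt


matrix_flat = flatten([[1, 2], [3, 4]])
nested_flat = flatten([1, [2, [3, [4, 5]]], [[6]]])
```

The same two-step definition handles both the regular matrix and the ragged, Lisp-style list, which is the sense in which it is more general than a flatten that assumes a rectangular array.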
With respect to sorting, in K you would also likely use the built in relational operator “?” (select).
Notice how you need to import pandas and numpy, and then know their docs well to find the routines you want and how they behave in edge cases? And that’s in addition to actually knowing Python?
K has all of that built in. You just need to know the basics (which takes more work than knowing Python well, admittedly). Most of the rest is derived by construction. It does have some 80 or so non-trivial primitives, but then you need far fewer libraries, often none.
(And, that’s not a for/against thing, but … in case you wonder, the K executable does that in about 200K binary without dependencies; REBOL achieves similar terseness of final programs by completely different means and philosophy, and also packs that into a 400K executable)
The point is that every idiom beagle3 noted is a simple and straightforward combination of general building-blocks, whereas nearly all of your "equivalents" are a one-off special-cased feature or function that needs to be learned on its own. The power and expressiveness of APL-family languages comes from the fact that they have a very small number of well-chosen parts that can be combined in flexible ways. Those patterns of combination become a higher-level vocabulary that fluent programmers grasp at sight, much as experienced readers of English learn to recognize the shapes of entire words at a time. This type of visual pattern recognition is facilitated by brevity.
APL-style idioms are not at all comparable to functions on a class or within a library, because idioms are self-describing in their entirety, requiring only an understanding of the primitive operators of the language, whereas a named function obscures and subordinates detail.
Array languages don't just have shorter tokens, they have fewer of them. A small token set is practical because your language's users can only keep a finite number of things in their working memory.
This constraint leads to symbol overloading. But careless, rampant overloading results in the same problem - too many things to remember. So you have to constrain your overloads.
With these constraints, if you want to design a practical, usable, general-purpose language (without forcing users to define every useful thing themselves), you have to choose composable abstractions. Prioritizing a single data structure (arrays) lets you focus your design effort and historically has good mechanical sympathy with the available computers, but there could just as easily be an "APL, but for associative maps" type language.
My point is that "good terseness" comes from a holistic design approach, and simply making a bad language more terse will make its flaws more obvious.
I think it's because obfuscated C is not nearly as succinct as the modern array languages. The formula for average in APL is just a few characters. I'm guessing the C equivalent would be a lot longer, and when you try the same for a real application, you'll have a very big overall difference.
[0] https://dev.to/bakerjd99/numpy-another-iverson-ghost-9mc
[1] https://www.ioccc.org, https://github.com/ioccc-src/winner/blob/master/2020/ferguso...