For late arrivals to this thread: note that the illustration being discussed has been updated (see the Wayback Machine for the old version). The new version probably still does not show the best second-order approximations, but the obvious qualitative errors have been corrected.
Coming from the type theory side with only a passing glance at Ada, I am nevertheless sure: this is not what type theorists mean when they talk about dependently typed languages. Such languages derive from the formulation of Per Martin-Löf (also called Intuitionistic Type Theory), they include dependent sum and dependent product types, and they allow type checkers to prove complex statements about code. (The history of dependent types is intertwined with the history of formalizing mathematics; dependent types were designed to encode essentially arbitrary mathematical statements.)
The interesting feature of Ada here seems to be what it calls "subtype predicates". As you've explained, these come in a "dynamic" flavor, which are a nice syntax for runtime assertions, and a static flavor, which are compile-time checked but restricted to certain static expressions (per https://ada-lang.io/docs/arm/AA-3/AA-3.2#p15_3_3.2.4).
An example of something you can do in a dependently typed language is write a sorting function in such a way that the type checker proves that the output will always be in sorted order. I am pretty sure this cannot be done in Ada; checking at runtime does not count!
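To make that concrete, here is a rough sketch in Lean 4 (the names are mine, not from any library): `Sorted` is a proposition about lists, and the type of `sortCertified` promises a sorted result, so writing the body amounts to supplying the proof.

```lean
-- A proposition stating that a list of naturals is in nondecreasing order.
def Sorted : List Nat → Prop
  | [] => True
  | [_] => True
  | a :: b :: rest => a ≤ b ∧ Sorted (b :: rest)

-- The *type* promises sortedness; any body that type-checks carries the proof.
def sortCertified (xs : List Nat) : { ys : List Nat // Sorted ys } :=
  sorry  -- filling this in is exactly the correctness proof
```

The point is that the guarantee lives in the signature and is checked at compile time; no runtime assertion is involved.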
I do believe (having heard from multiple sources) that Ada's type system was ahead of its time and its success in creating practical programs that are likely to be correct is probably underrated. But I'm not here just to legislate semantics; one should be aware that there is something vastly more powerful out there called "dependent types" (even if that power is not likely to come into most people's day-to-day).
(Unfortunately Wikipedia is quite poor on this topic; you will see, for example, that on the Talk page someone asked "Is Ada really dependently typed?" two years ago; no reply. And it makes no sense to say that Ada has "tactics" but not "proof terms"; tactics are a way of generating proof terms. There are many better resources out there (especially ones associated with the languages Agda, Coq (currently being renamed Rocq), and Lean, e.g. https://lean-lang.org/theorem_proving_in_lean4/dependent_typ...). But be warned, there is no "short version": dependent types cannot be explained in a sentence, and they are not something you will arrive at with enough "hacking away".)
> An example of something you can do in a dependently typed language is write a sorting function in such a way that the type checker proves that the output will always be in sorted order. I am pretty sure this cannot be done in Ada; checking at runtime does not count!
It actually can be done in Ada, though not purely with the type system; instead we rely on SPARK, which translates Ada code into verification conditions and discharges them with various automatic theorem provers. Some examples of fully proven sorting functions are here: https://github.com/AdaCore/spark2014/blob/master/testsuite/g...
You can also see from that code just how good SPARK and the underlying theorem provers have become: only a reasonably small number of assertions is required to prove both that the output is sorted and that the input and output contain the same elements, not to mention all the hidden proofs relating to integer overflow, out-of-bounds access, etc.
You could maybe do all this with types and SPARK, but it's not the approach that would usually be taken.
Ah, very interesting. It does seem that the Ada community has done serious engineering work to build in powerful formal verification, in a way that is somehow parallel to the (much slower, for practical purposes, if more elegant) arc of type theory...
What is the flow for working through this kind of proof? Is there an interactive proof mode like you find in a lot of dependent type provers? Or is there some other guiding mechanism for telling you that you haven't provided enough guidance with asserts?
Do note however that the "proof" part of dependent types requires being able to evaluate arbitrary parts of the program at "compile time". (As a fact of the matter, there isn't even a clean phase separation between compile time and run time in dependently-typed languages; the distinction can only be reintroduced after-the-fact via "program extraction".) So, in a sense, types may depend on values in a dependently-typed language but this is merely an elaborate form of meta-programming, it need not establish a true dependence from runtime values. Whether Ada qualifies as a 'true' dependently-typed language, in this view, would depend on how strong its forms of guaranteed compile-time evaluation and meta-programming are.
It does look like the "discriminant" system of Ada shares key similarities with what dependently-typed languages call a "dependent sum", a generalized record type where "later" elements of the record can depend on "earlier" ones.
"Dependent products", on the other hand, can be viewed as an extended version of generic functions, although they may also suffice to account for e.g. the examples given by OP of Ada's runtime range types and runtime-sized arrays. The key here being that the representation of a type is indeed given as a "function" of the value parameters that are "depended" on.
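As an illustration of both notions, here is a hypothetical Lean 4 sketch (all names invented): a dependent sum is a pair whose second component's type depends on the first component's value, much like a discriminated record; a dependent product is a function whose result type depends on its argument.

```lean
-- Dependent sum: the length discriminant comes first, and the type of the
-- payload depends on it (compare Ada's discriminated records).
def SizedList := (n : Nat) × { xs : List Nat // xs.length = n }

-- Dependent product: the *result type* mentions the argument value n,
-- so the checker guarantees the returned list has exactly n elements.
def zeros : (n : Nat) → { xs : List Nat // xs.length = n } :=
  fun n => ⟨List.replicate n 0, by simp⟩
```

In a runtime-sized array, the "function from value to representation" is exactly what the result type of `zeros` expresses.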
It looks really difficult to prove even a "hello world" algorithm. I'm afraid you can easily run into the problem of not understanding what you're proving, and end up proving something other than what you actually wanted.
What’s nice is that you can do it in steps - you may have a hard time proving the full specification, but you can prove the absence of bad behavior like buffer overruns, etc., and go from there.
Maybe they're not implying this kind of limited dependent type system but surely it is still dependently typed? It's just not the "full fat" dependent typing.
Another example of a language with limited dependent typing is Sail. It has "lightweight" dependent types for integers and array lengths (pretty similar to Ada from what it sounds like).
It's very good in my experience - it lets you do a lot of powerful stuff without having to have a PhD in formal verification (again, sounds similar to Ada).
> An example of something you can do in a dependently typed language is write a sorting function in such a way that the type checker proves that the output will always be in sorted order.
Yeah, you can't do that, but you can have the type checker say things like "n^m is positive if n is positive or even" or "foo(x) -> (bits(n), bits(m)) with m+n=x" (not the actual syntax). I'm pretty sure you can't do that stuff in a type system without dependent types, right?
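A rough Lean 4 analogue of that second kind of signature (names invented here; this is not Sail syntax) tracks bit-vector lengths in the types, so length arithmetic is checked by the type checker:

```lean
-- A length-indexed bit vector: a list of booleans with its length in the type.
abbrev Bits (n : Nat) := { xs : List Bool // xs.length = n }

-- Appending an n-bit and an m-bit vector yields an (n + m)-bit vector;
-- the proof obligation is the length equation, discharged by simp.
def append {n m : Nat} (a : Bits n) (b : Bits m) : Bits (n + m) :=
  ⟨a.val ++ b.val, by simp [a.property, b.property]⟩
```

This is the "lightweight" end of dependent typing: only arithmetic over index values, rather than arbitrary propositions.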
Well, "dependently typed" is widely used to mean something like "derived from Martin-Löf type theory, including arbitrary dependent sums and dependent products"; in other words, "dependent types" means "full fat dependent types", and it's the things that are less powerful that require qualifiers.
(So when Sail says it has "lightweight dependent types", that seems fine to me (it does seem to do more than it could with simple types or polymorphic types), but if it simply asserted that it "had dependent types" I would feel misled.)
The wording is subtle and language does change, but what I want to push back on is the confusion I see from time to time that "if I can write anything that looks like a function from values to types, I have the same thing that everybody talking about dependent types has". If you think this you don't know what you're missing, or even that there is something you're missing, and what you're missing is very cool!
Would you say Lean is a somewhat learnable language for somebody who only has cursory exposure to functional programming and static types? I’ve almost exclusively used typescript for the last few years, except for some clojure in the last few months.
Sometimes I find a neat language, but my very TS-oriented brain has a hard time getting into it.
I would say dependent types are going into the deep end; unless you have a real need to prove things, it may be hard to see the motivation to learn such abstractions.
In between ad hoc types like TypeScript and dependently-typed languages like Agda, Coq/Rocq, and Lean are well-typed, polymorphic (but not dependent) languages like OCaml, F#, or Haskell ("ML family" or "Hindley-Milner" are related terms). Those are what I'd suggest checking out first!
If you’d like to dive into the deep end and learn a “we tried to make as much type system power available as conceptually simply and elegantly as possible” language, Agda [1] is a very fun language to fool around with. It’s superficially a Haskell-like, but you’ll likely only really learn it as a toy that nevertheless blows your mind… like Prolog etc.
The first thing I tried to do after learning all the basic syntax was to write a function f : Nat -> Set (from numbers to types), and then stick it in a function signature (i.e., g : f 3).
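The same experiment reads roughly like this in Lean 4 (toy definitions of my own, not library code):

```lean
-- A function from numbers to *types*.
def f : Nat → Type
  | 0 => Unit
  | _ + 1 => Nat

-- `f 3` computes to `Nat` during type checking, so a number is accepted here.
def g : f 3 := (42 : Nat)
```

The type checker evaluates `f 3` while checking `g`, which is exactly the "types computed from values" behavior that makes these languages feel so different at first.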
> you're not going to be able to cover all valid families that way
In fact the emoji committee backpedaled on family permutations for exactly this reason, and now recommends (exactly as you suggest) "symbolic" family glyphs and juxtaposition of existing people-emoji to describe families in detail.
Even the new proposal still has problems. It still has all of the variations between 1 and 2 parents and 1 and 2 children. IMHO it should all be collapsed down to just two Emoji:
1. Two people without children
2. Two people with children.
The former being short for "just the parents" or childless couples and the latter encompassing all families with kids regardless of the numbers. This is a compromise answer, but I think it serves the purpose best given the intended use of emojis.
An even better solution would be to figure out some iconography that would denote "family" without explicitly depicting the people, but I'm at a loss on that one. I mean, the people are what families are all about; it's hard to divorce the concept.
"Free algebra" here comes from universal algebra/category theory, which is quite distinct from the "free algebra" of ring theory (having nothing in particular to do with sums and products). Unfortunately "algebra" might be the most overloaded term in mathematics.
(The article says in the second paragraph: "the name comes from universal algebra". The wiki article you linked disambiguates itself with universal algebra right at the top.)
"Algebraic data type" in this sense is synonymous with "inductive data type".
The values of an algebraic data type do not form a free ring or any ring. You cannot generally add or multiply them. You may write a particular type with constructors intended to serve this purpose, but the laws defining a ring will not hold.
A separate fact is that types (not their values) belong to a free semiring "up to isomorphism" (where sum and product are the categorical coproduct and product). It is this fact that seems to have been conflated with the "algebra" of "algebraic data types". The disambiguation of this with the sense from universal algebra is the very purpose of the article.
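"Up to isomorphism" can be made concrete with a sketch in Lean 4 (my own definitions): distributivity of product over sum is witnessed by a pair of mutually inverse conversion functions between the two types, not by any equality of values.

```lean
-- A × (B + C) → (A × B) + (A × C)
def toSum {A B C : Type} : A × Sum B C → Sum (A × B) (A × C)
  | (a, .inl b) => .inl (a, b)
  | (a, .inr c) => .inr (a, c)

-- (A × B) + (A × C) → A × (B + C): the inverse direction.
def fromSum {A B C : Type} : Sum (A × B) (A × C) → A × Sum B C
  | .inl (a, b) => (a, .inl b)
  | .inr (a, c) => (a, .inr c)
```

Round-tripping any value through `toSum` and `fromSum` gives it back unchanged, which is what the semiring laws "up to isomorphism" amount to.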
In his 1980 paper "ELECTRONIC CATEGORY THEORY", Rod Burstall explores "using algebraic and categorical ideas to write programs", and explains how his interest in "using algebraic ideas in programming goes back to work with Peter Landin, who first interested me in universal algebra and category theory" in the '60s.
Also, you have to love this introduction:
> If we can have electronic music, why not electronic category theory? 'Music has charms to soothe the savage breast', said Congreve. Has category theory less charm? Can we not make the electrons dance to a categorical tune?
You have to love his whimsy.
I've been playing with Maude lately (based on OBJ and Clear), and it really seems like the motivation for algebraic data types has been about driving categorical algebras / universal algebras from the beginning. At some point it clicked how it felt like I was defining categories (modules), objects (sorts), and morphisms (ops), and then adding constraints by writing equations.
I don't know Maude or category theory well enough to say it qualifies as an "electronic category theory" for some defined set of categories (small?), but I can see that vision, and how it's a little disappointing this vision isn't better understood.
FYI, Burstall sought out Jim Thatcher from the ADJ group, and Thatcher (who was focused on stopping the Vietnam war) referred Burstall to Goguen, the ADJ group's practicing logician / category theorist. From what I read, the reason Goguen was even at the ADJ group in the first place is that MacLane recommended him for the position while Goguen was studying under him in Chicago.
When you consider the timeline: Burstall and Goguen first met in 1977. They figured out the semantics for this language very quickly, defined institutional model theory, which formalized a minimal definition of "what is a logic?", and then used that as the basis for creating Clear, which inspired Burstall's Hope and Goguen's OBJ.
The fact that they did all that in such short order is very telling (IMO) of how long they had each been concocting schemes to get universal algebra and category theory into computer science.
If you start with the phone upright and rotate the screen away from you by turning the phone around the vertical axis, then both rotations are around the same axis and of course they do commute.
My guess is that romwell is holding the phone flat, so that the rotation away from you is about a horizontal axis; then you should experience the noncommutativity.
(The resulting orientations are 180 degrees apart, which indeed makes it difficult to say that any one orientation should be the unique average. But this is due to the geodesic structure of the space of rotations, not the noncommutative product that happened to construct these points, see above.)
The most widely-used concept of "average" is surely a point that minimizes the sum of squared distances to each of a list of input points. Distances are canonically defined in the space of rotations, and so are averages in this sense (not always uniquely).
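In symbols (a sketch, with the distance taken to be the rotation angle from Euler's rotation theorem):

```latex
% Frechet/Karcher mean of rotations R_1, ..., R_k:
\bar{R} \;=\; \operatorname*{arg\,min}_{R \in SO(3)} \; \sum_{i=1}^{k} d(R, R_i)^2,
\qquad d(R, Q) \;=\; \theta\!\left(R Q^{\top}\right)
% where \theta(M) denotes the rotation angle of the rotation matrix M.
```

Nothing in this definition multiplies rotations together in any particular order, which is why commutativity never enters.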
Commutativity has nothing to do with this; do not confuse the typical formula for averaging with the reason for doing so! Of course, there are other senses of "average" (which generally do continue to apply to the space of rotations as well).
The application for this given by GreedCtrl's reference is to spline interpolation. Another is in robotics, when combining multiple noisy observations of the orientation of an object.
>Distances are canonically defined in the space of rotations
I am sorry, but this is simply not true.
There are many distance/metric definitions that are applicable to the space of rotations, and the best choice of metric is defined by the application, which is why I asked that question.
None of them is any more "canonical" than the other. See [1][2][3] for an introduction and comparison.
One will find there at least four "canonical" distance definitions, and applications ranging from optometry to computer graphics to IK (which is what you referred to).
>The most widely-used concept of "average" is surely a point that minimizes the sum of squared distances...Of course, there are other senses of "average" (which generally do continue to apply to the space of rotations as well).
I know this, not all of the readers may.
What I don't know is what context the parent is coming from.
Maybe all they need is interpolating between two camera positions - which is a much simpler problem than the paper they found (and what we're discussing) is addressing.
>The application for this given by GreedCtrl's reference is to spline interpolation.
It is not clear that the reference that they have found is actually the best for their application - they only said it was something they found, and that the article we're discussing looks "simpler" for their level of mathematics.
The article we are discussing does not provide any means of "averaging" any more than two rotations, though, which motivated my question.
The bi-invariant metric as pointed out by chombier is what I have in mind. I agree that a non-canonical metric may be the right one for some applications, but those are the exceptional ones. The bi-invariant metric (which has a simple, geometric meaning given by Euler's rotation theorem) is the right starting point for thinking about distances in this space.
(Your reference [2] supports this point of view: read "simplest" as "canonical". Your reference [1] claims five distinct bi-invariant metrics, but this is wrong. The argument given is that any metric related to a bi-invariant metric by a "positive continuous strictly increasing function" is itself bi-invariant, which is not true.)
> >Distances are canonically defined in the space of rotations
> I am sorry, but this is simply not true.
It is true: there is a canonical choice, given by the bi-invariant Riemannian metric on compact Lie groups such as the rotations (in this case, the shortest angle between rotations).
Whether or not you want this metric in practice is another problem, of course.
> The article we are discussing does not provide any means of "averaging" any more than two rotations,
The Karcher/Fréchet mean described in the original article does average more than two rotations.
Wiki says that curl is standard in North America, while rot is common in "the rest of the world, particularly in 20th century scientific literature". As a North American I can confirm that I was always taught curl and only saw rot in older books.
That said ∇× is what I've seen most commonly overall.
Ditto 1980's era Australian physics and engineering courses.
Throw in a right hand thumbs up for "direction" of curl (fingers indicate rotation, orthogonal thumb direction is orientation) and other results about paths having to have a zero rotation between places with opposing rotation, etc.