Isn't a join of two tables essentially the product of the two tables?
That said it does appear that sums are missing. Infact it seems like every mainstream language manages to fail to add sum types for some strange reason and you have to recreate them from scratch.
An unconstrained `join` is indeed the product of two queries, and a `union all` can be used to get their sum as well. This is in the query language. But what I need is sums and products in the table language.
There is a functor from the category of tables to the category of queries, which maps every table `foo` to the query `select * from foo`, however, a table isn't the same thing as a query: table fields may contain user-entered data, whereas query fields are always computed from table fields.
To give a concrete example, consider a database with two primitive tables, `male` and `female`. A primitive table is “independent” in the sense that you can add or remove rows from it at will. In other words, a primitive table behaves just like a normal SQL table.
Now define the derived table `person` as the sum of `male` and `female`. Because `person` is derived, you don't explicitly add or remove rows from it. Instead, every time you add or remove a `male` or `female`, a corresponding `person` also gets added or removed.
What I want is the ability to add the field `name` to the `person` table directly, without it existing in either `male` or `female`. You can't do this in SQL. The situation is similar for products.
> An unconstrained `join` is indeed the product of two queries, and a `union all` can be used to get their sum as well. This is in the query language. But what I need is sums and products in the table language.
The "table" language of SQL (DDL) includes pretty much the the entirety of the "query" language (DQL) through view definitions.
Of course, to do what you would really want to do to use relations to implement product and sum types, you needed materialized views with appropriate unique indexes for the candidate keys (and, ideally, auto-deriving the keys for product/sum tables from those of the base tables -- for product tables you can just concatenate the keys of the base tables, for sum tables you need the keys from the base tables to be equivalent and then to have an additional column that uniquely maps to the source table.)
A materialized view cannot contain user-editable fields unless these fields actually come from some actual table. Hence, a materialized view isn't a table in its own right, let alone a sum or product table.
> A materialized view cannot contain user-editable fields unless these fields actually come from some actual table.
"User-editable" is actually superfluous; this is true (or false) of read-write attributes in exactly the same way as it is of read-only attributes.
> Hence, a materialized view isn't a table in its own right, let alone a sum or product table.
You seem to be use "sum or product type" and "sum or product table" in somewhat unusual ways. Upthread, you suggested that sum and product tables were simply realizations of sum and product types by way of tables; but sum and product types have domains that are, respectively, the sum or product of the domains of the set of types each is based on, they don't include additional data.
The kind if augment sum or product relation you seem to be referring to can be achieved in a relational database through views (including, to the extent useful to the application at hand, materialized/indexed views), where the "base" sum or product type is a materialized view as described in the grandparent comment (including the described indexes), and the additional fields are supplied through a related table with a foreign key constraint (the augmented sum/product being represented with a view that joins that table to the base sum/product view.)
In general, to say that a category has sums and products, these sums and products must be objects of the same category. The axioms for a category are totally agnostic to the concrete nature of its objects. Just because Hask objects are types, it doesn't mean objects in other categories are Haskell types of behave like Haskell types. In the context of databases, it makes sense to treat a schema as a category whose objects are its tables, and whose morphisms are chained foreign key traversals. And tables contain fields that carry data of their own.
I still don't understand. What you call derived tables seem like views to me. They always need to be composed of primitive tables. So there will be a primitive `name` table and the person view will be `name * (male + female)`... Admittedly everything will be in 6NF but it still seems doable. Is there a limitation that I'm missing?
Sorry for the delay. I wasn't really thinking of making a primitive table just for names - what would its primary key be anyway?
The limitation you're missing is that SQL doesn't let you readily associate a user-entered `name` with each `person`. The best you can do is put a `name` field in the `male` table, then another `name` field in the `female` table, and use both `name`s when defining a `person` view. In my opinion, this is inelegant.
It isn't apparent from my identifiers, but, in my `existing.sql`, `person`'s real primary key isn't just `person_id`, but rather `(gender, person_id)`.
The way you've handled it, now you have an invariant to maintain that `male_id`s and `female_id`s don't collide. If you want to define arbitrarily many sum tables in your database, this can be really hard to enforce. My `proposed.sql` doesn't have such a problem.
Product tables are missing too. Let's say you have tables `foo` and `bar`. For every `foo` and every `bar`, you want the user to specify a `qux` value. Presently, what you need to do is:
(0) Create a table `foo_bar`, with fields `foo_id`, `bar_id` and `qux`. In particular, `qux` must be nullable. [Yuck!]
(1) Add triggers to `foo` and `bar` that automatically insert or delete rows from `foo_bar`.
(2) Hope [I'm not joking] the user remembers to set all the `qux` values in `foo_bar` whenever he inserts a row into either `foo` or `bar`.
That said it does appear that sums are missing. Infact it seems like every mainstream language manages to fail to add sum types for some strange reason and you have to recreate them from scratch.