I am a computer scientist working in programming languages (so with no particular expertise in combinatorics).
In my experience, treewidth is one of those ideas at the outer limits of my ability to understand.
I have spent several hours staring at the idea on several different occasions, and at the end of each of these sessions, I come away with a vague sense of why it is important and why the definition is natural, only for its complexity to completely overwhelm me the next time I encounter the concept.
The cops-and-robbers definition of treewidth is quite intuitive.
You have an undirected graph with a robber moving infinitely fast along the edges of the graph.
You have k cops, each driving their own very slow helicopter that can land on any vertex you want (but it takes some time). The cops can communicate and they know at any time where the robber is.
Given a graph, how many cops do you need to catch the robber, i.e., trap the robber on an edge?
In a tree: 2
On a cycle: 3
In an n × m grid graph: you need min(n, m) + 1.
---
In the game where the cops don't know where the robber is, the "width measure" is called pathwidth.
From wikipedia https://en.wikipedia.org/wiki/Carving_width#Related_paramete...:
“[…], it can be shown that for any graph, the carving width is greater than or equal to half the branch width, and is less than or equal to the degree times the branchwidth. Because treewidth and branchwidth are always within constant factors of each other, similar bounds can be used to relate carving width to treewidth.”
A graph is a fundamental notion in computer science because it allows us to model numerous interesting real-life problems.
Trees are a special kind of graph that is structurally very simple, so many interesting problems are very easy to solve on them.
Some graphs are not trees but are similar to trees, and on such graphs we can leverage that tree-likeness to solve interesting problems much more easily and efficiently.
Treewidth formally captures the following idea: measure how similar a graph is to a tree.
The notion allows us to formally analyze and quantify the complexity of algorithms that make use of this tree-likeness.
There is an analogous notion for path-like graphs, called pathwidth.
Treewidth is much more interesting in practice, but the definition of pathwidth (https://en.wikipedia.org/wiki/Pathwidth) is probably a bit easier to understand, so you might look into that first.
The idea in both cases is to decompose a graph into not necessarily disjoint bags (sets) of neighboring vertices, with additional rules on the bags which ensure that the decomposition itself is a tree or a path, respectively, when the bags are interpreted as nodes.
More formally, a tree decomposition t of a graph G labels the nodes of t with bags of vertices of G such that
1. for every edge of G, some bag of t contains both of its endpoints (and every vertex of G appears in some bag),
2. for every vertex v of G, the set of nodes of t whose bags contain v forms a connected subtree of t.
For a given graph one can construct many tree decompositions, but it is harder to get ones with small bags, and the smaller the bags, the more similar to a tree the graph is.
We say that a graph has treewidth at most n if there is a tree decomposition in which each bag has size at most n+1; the treewidth is the smallest such n.
In particular, a tree has treewidth 1 because we can always construct a tree decomposition with bags of size at most 2.
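To make the two conditions concrete, here is a minimal Python sketch (my own illustration; the function and example names are not from the comment) that checks a candidate tree decomposition. The example verifies a decomposition of a 4-cycle with bags of size 3, i.e., treewidth at most 2, which is consistent with the 3 cops on a cycle mentioned earlier:

def is_tree_decomposition(graph_edges, bags, tree_edges):
    """graph_edges: set of frozensets {u, v}; bags: dict node -> set of
    vertices; tree_edges: edges of the decomposition (assumed to be a tree)."""
    # Condition 1: every edge of G has both endpoints together in some bag.
    for edge in graph_edges:
        if not any(edge <= bag for bag in bags.values()):
            return False
    # Condition 2: for every vertex v, the decomposition nodes whose bags
    # contain v must form a connected subtree.
    for v in set().union(*bags.values()):
        nodes = {t for t, bag in bags.items() if v in bag}
        start = next(iter(nodes))
        seen, frontier = {start}, [start]
        while frontier:
            t = frontier.pop()
            for a, b in tree_edges:
                for x, y in ((a, b), (b, a)):
                    if x == t and y in nodes and y not in seen:
                        seen.add(y)
                        frontier.append(y)
        if seen != nodes:
            return False
    return True

# A 4-cycle 1-2-3-4-1: two bags of size 3 joined by one decomposition edge.
cycle = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4), (4, 1)]}
bags = {"a": {1, 2, 3}, "b": {1, 3, 4}}
print(is_tree_decomposition(cycle, bags, [("a", "b")]))  # True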
There are different equivalent characterizations of the notion, one of them is the cops-and-robbers definition, which already appeared in a comment earlier https://news.ycombinator.com/item?id=42695879 and was introduced in https://thomas.math.gatech.edu/PAP/search.pdf.
To give an idea of why treewidth can be characterized by this game, observe the following.
Playing on a tree, you do not need many cops to capture the robber: you can block the fast robber by putting cops on two neighboring vertices and then moving toward the robber.
So on a general graph you can block the robber by filling two adjacent bags of a tree decomposition with cops and moving toward the robber.
I am not a mathematician, but here is a motivation I read somewhere some years ago.
There are basically two ways to produce big numbers: add two small numbers, or multiply two small numbers. You can produce all positive integers by starting with zero and repeatedly adding one. You can almost do the same thing with multiplication too, except for these pesky primes, which are somehow atomic. Naturally then, one might ask: (a) How many primes are there? (b) How frequently do they occur? (c) Can we look at a number and determine whether it is a prime?
Now consider: Despite being among the oldest of the mathematical disciplines, there are still open problems about primes that can be explained to high school students.
Also, multiplication and addition are not simply operations that are of interest with respect to integers, but similar ideas apply to a bunch of other domains too. Polynomials, for example. So primality and primality-like ideas are like catnip for mathematicians.
Perhaps another way of asking the question is: Are there any results, either about individual programs or in PL theory as a whole, that were made simpler / clarified / generalized because of category theoretic insights?
Yes, but it's much, much faster and more productive to just learn Haskell's type system (mainly Functors, Monads and Applicatives) as 3 individual "design patterns" than it is to figure out how you can even begin to apply the Yoneda Lemma to whatever programming problem you have in front of you.
Category Theory has birthed some useful abstractions, but you don't really need any of CT to learn and use the abstractions.
The basics of Abstract Algebra on the other hand are absolutely worth the bit of time it takes to learn them.
Groups, semi-groups, rings, fields, monoids - distributivity, identity, zeros, associativity, commutativity are pretty trivial to learn - most people already know the underlying concepts and they pop up all the time in programming.
Monoids are incredibly useful - most people already know them and intuitively use them, but it's helpful to name and understand the pattern.
Category Theory just doesn't have the same return on investment for software development.
There is a very useful perspective in which categories are just typed monoids, and the monoid operation can only be applied when the types "line up". For example, here are some useful operations which do not form a monoid:
- PUSH(n) where n is a floating-point number
- POP
- CLEAR
- ADD, SUB, MUL, DIV
- ID
These can be interpreted as operations on a stack of floating-point numbers in the obvious way, PUSH(1.2) * [3.14] == [1.2, 3.14], POP * [1, 2, 3] == [2, 3], ADD * [1, 2, 5] == [3, 5], CLEAR * [1, 2, 3] == [], ID * [1, 2, 3] == [1, 2, 3], etc. However, not all of the compositions of stack operations are legal. For example, ADD * PUSH(1) * PUSH(2) is fine and equivalent to PUSH(3), but ADD * PUSH(1) * CLEAR is illegal.
Ok, so our stack operations don't form a monoid. But they obviously can still be composed, sometimes, so what do we have if not a monoid? They form a category! There is one object for each natural number, representing the height of the stack. So there are arrows like PUSH(3.14) : Height_{n} -> Height_{n+1} for all n, and POP : Height_{n} -> Height_{n-1} whenever n >= 1, and ADD : Height_{n} -> Height_{n-1} whenever n >= 2.
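Here is a rough Python sketch of that "typed monoid" view (my own illustration, and a loose dynamic check rather than a faithful typed encoding; CLEAR is omitted because it is a whole family of arrows Height_n -> Height_0). Each operation records the stack height it needs and how it changes that height, and composition only goes through when those line up:

class StackOp:
    def __init__(self, name, needs, delta, fn):
        self.name = name      # label for error messages
        self.needs = needs    # minimum stack height required
        self.delta = delta    # net change in stack height
        self.fn = fn          # the actual effect on a list-as-stack

    def __mul__(self, other):  # self * other means "do other, then self"
        needs = max(other.needs, self.needs - other.delta)
        return StackOp(f"{self.name} * {other.name}", needs,
                       self.delta + other.delta,
                       lambda s: self.fn(other.fn(s)))

    def __call__(self, stack):
        if len(stack) < self.needs:
            raise TypeError(f"{self.name} needs stack height >= {self.needs}")
        return self.fn(stack)

PUSH = lambda n: StackOp(f"PUSH({n})", 0, +1, lambda s: [n] + s)
POP  = StackOp("POP", 1, -1, lambda s: s[1:])
ADD  = StackOp("ADD", 2, -1, lambda s: [s[0] + s[1]] + s[2:])
ID   = StackOp("ID", 0, 0, lambda s: s)

print((ADD * PUSH(1) * PUSH(2))([]))  # [3]: the heights line up
try:
    (ADD * PUSH(1))([])               # ADD would need two values on the stack
except TypeError as e:
    print("illegal composition:", e)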
Another common example is matrices. Square matrices form a monoid, but what about arbitrary rectangular matrices? They don't form a monoid, but they do form a category where the objects are natural numbers, and the arrows N -> M are just the MxN matrices. You can't multiply any two matrices, but if you have a P -> Q matrix (QxP) and a Q -> R (RxQ) matrix then you can multiply them to get a P -> R matrix (RxP).
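The same idea fits in a few lines of NumPy (my example, not the parent's): matrix multiplication is the composition, and it is only defined when the objects, i.e., the dimensions, line up.

import numpy as np

f = np.ones((4, 2))   # an arrow 2 -> 4 (a 4x2 matrix)
g = np.ones((3, 4))   # an arrow 4 -> 3 (a 3x4 matrix)
print((g @ f).shape)  # (3, 2): the composite arrow 2 -> 3
try:
    f @ g             # 4 -> 3 followed by 2 -> 4: objects don't match
except ValueError as e:
    print("composition undefined:", e)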
Monads are a traditional stumbling block for Haskell newbies, including me when I was one. I found that those articles I linked demystified them more than any number of "monad tutorials" ever could do.
The thing about questions like this is that the complexity of mathematical explanations is highly dependent on what each reader considers "simple." Consider two different approaches to understanding a concept, one with categorical framing and one without, and the cognitive overhead each imposes.
This additional complexity layer of categorical framing has a nonzero cognitive cost, which varies based on the learner's intuitions and background. The investment in learning categorical thinking only becomes worthwhile when the learner can amortize its cost by envisioning its broad applicability - when they can see how categorical frameworks enable systematic problem decomposition and generalization. This implies they've already recognized the limitations and redundancies of non-categorical approaches, understanding intuitively how many concepts would need to be reinvented as the problem evolves within its conceptual framework (gestell).
However, there exists a middle path that often goes unrecognized as categorical thinking, despite embodying its essence. This approach involves incrementally discovering generalizations -- "oh, this can be generalized with this type" or "oh, if I wrap this in another type I can do something else later on" or "oh this syntactic sugar for this particular operator overload feels quite nice"
Yes, there are a number of them. Here are some examples off the top of my head:
- Moggi was studying the problem of equivalence of programs, and noted that the traditional approach to modeling a program as a total function Input -> Output is problematic. He pioneered the use of monads and Kleisli categories as a foundation for reasoning about equivalence of real programs, including all the real-world nastiness like non-termination, partiality (e.g. throwing an exception that kills the program), non-determinism, and so on. https://person.dibris.unige.it/moggi-eugenio/ftp/lics89.pdf
- Linear logic (and its close relative affine logic) was the inspiration behind Rust's ownership model, from what I understand. Linear logic was originally described in terms of the sequent calculus by Girard (http://girard.perso.math.cnrs.fr/linear.pdf), but later work used certain categories as a model of linear logic (https://ncatlab.org/nlab/files/SeelyLinearLogic.pdf). This answered and clarified a number of questions stemming from Girard's original work.
- Cartesian-closed categories (CCCs) form models of the simply-typed lambda calculus, in the sense that any lambda term can be interpreted as a value in a CCC. Conal Elliott pointed out that this means that a lambda term doesn't have just one natural meaning; it can be given a multitude of meanings by interpreting the same term in different CCCs. He shows how to use this idea to "interpret" a program into a circuit that implements the program. http://conal.net/papers/compiling-to-categories/
- There is a classical construction about minimizing a DFA due to Brzozowski which is a bit magical. Given a DFA, do the following process twice: (a) get an NFA for the reverse language by reversing all edges in the DFA and swapping start / accept nodes, then (b) drop any nodes which are not reachable from a start node in the NFA. The result will be the minimal DFA that accepts the same language as your original DFA! Bonchi, Bonsangue, Rutten, and Silva analyzed Brzozowski's algorithm from a categorical perspective, which allowed them to give a very clear explanation of why it works along with a novel generalization of Brzozowski's algorithm to other kinds of automata. https://alexandrasilva.org/files/RechabilityObservability.pd...
- I would also put the development of lenses in this list, but they haven't leaked very far outside of the Haskell universe yet so I don't think they are a compelling example. Check back in 5 years perhaps. Here's a blog post describing how lenses relate to jq and xpath: https://chrispenner.ca/posts/traversal-systems
- I've personally had success in finding useful generalizations of existing algorithms by finding a monoid in the algorithm and replacing it with a category, using the fact that categories are like "many-kinded monoids" in some sense. I haven't written any of these cases up yet, so check back in 2 years or so. In any case, they've been useful enough to drive some unique user-facing features.
This comment pairs very well with the recent thread on Heaviside’s operator calculus. He got ahead of theory, did groundbreaking work, and was ultimately superseded by more principled approaches. I think this illustrates a kind of intellectual leapfrog. Sometimes we do things that are right, but it’s not clear why they work, and other times we have theoretical structures that open up new realms of practice.
I just realized I botched the description of Brzozowski's algorithm, step (b) should be "determinize the NFA using the powerset construction". Mea culpa.
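In case it helps, here is a rough Python sketch of the corrected recipe (reverse, then powerset-determinize keeping only the reachable subsets, and do that twice). It is my own illustration, not code from the linked paper:

def reverse(nfa):
    # nfa = (states, alphabet, delta, starts, accepts), with delta mapping
    # (state, symbol) -> set of successors. Reversing flips every edge and
    # swaps the start and accept sets.
    states, alphabet, delta, starts, accepts = nfa
    rdelta = {}
    for (q, a), succs in delta.items():
        for r in succs:
            rdelta.setdefault((r, a), set()).add(q)
    return (states, alphabet, rdelta, accepts, starts)

def determinize(nfa):
    # Powerset construction, keeping only subsets reachable from the start.
    states, alphabet, delta, starts, accepts = nfa
    start = frozenset(starts)
    dstates, ddelta, frontier = {start}, {}, [start]
    while frontier:
        S = frontier.pop()
        for a in alphabet:
            T = frozenset(r for q in S for r in delta.get((q, a), ()))
            ddelta[(S, a)] = {T}
            if T not in dstates:
                dstates.add(T)
                frontier.append(T)
    return (dstates, alphabet, ddelta,
            {start}, {S for S in dstates if S & accepts})

def brzozowski(dfa):
    # Reverse + determinize, applied twice, yields the minimal equivalent DFA.
    return determinize(reverse(determinize(reverse(dfa))))

# Example: a DFA for "strings over {a, b} ending in ab" with a redundant
# state (3 behaves exactly like 1); minimization collapses it to 3 states.
delta = {(0, "a"): {1}, (0, "b"): {0}, (1, "a"): {1}, (1, "b"): {2},
         (2, "a"): {3}, (2, "b"): {0}, (3, "a"): {3}, (3, "b"): {2}}
dfa = ({0, 1, 2, 3}, {"a", "b"}, delta, {0}, {2})
print(len(brzozowski(dfa)[0]))  # 3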
If I understand correctly: The idea is to scan every ballot, and upload all scans to a public website. The system preserves anonymity because voters are not required to write their names or other PII on the ballot paper.
I still have lots of questions:
1. Doesn't this system raise the possibility of coercion? For example, a goon or abusive spouse might, under threat of violence, force you to vote in a certain way and mark your ballot for them to audit afterwards. Isn't plausible deniability also one of the key desiderata of the election process?
2. The system allows me to mark my ballot paper and confirm that my vote was correctly counted, after the fact. But I still need to trust all the other votes uploaded to the website. Of course, the presence of independent election observers (who watch the counting process and the ballot boxes being moved around) would mitigate this fear.
As someone who is intimately familiar with Datalog, but has not read much about Logica:
The way I read these rules is not from left-to-right but from right-to-left. In this case, it would say: pick two numbers a > 1 and b > 1; their product a*b is a composite number. The solver starts with the facts that are immediately evident, and repeatedly applies these rules until no more conclusions are left to be drawn.
"But there are infinitely many composite numbers," you'll object. To which I will point out the limit of numbers <= 30 in the line above. So the fixpoint is achieved in bounded time.
Datalog is usually defined using what is called set semantics. In other words, tuples are either derivable or not. A cursory inspection of the page seems to indicate that Logica works over bags / multisets. The distinct keyword in the rule seems to have something to do with this, but I am not entirely sure.
This reading of Datalog rules is commonly called bottom-up evaluation. Assuming a finite universe, bottom-up and top-down evaluation are equivalent, although one approach might be computationally more expensive, as you point out.
In contrast to this, Prolog enforces a top-down evaluation approach, though the actual mechanics of evaluation are somewhat more complicated.
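To make the bottom-up reading concrete, here is a tiny Python sketch of the same program evaluated as a naive fixpoint (my illustration of the evaluation order only; Logica itself compiles the rules to SQL):

# Number(x + 1) :- x in Range(30)
number = set(range(1, 31))

# Composite(a * b) :- Number(a), Number(b), a > 1, b > 1
# Apply the rule repeatedly until no new facts appear (the fixpoint).
composite = set()
while True:
    derived = {a * b for a in number for b in number if a > 1 and b > 1}
    if derived <= composite:
        break
    composite |= derived

# Prime(n) :- Number(n), n > 1, ~Composite(n)
prime = {n for n in number if n > 1 and n not in composite}
print(sorted(prime))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

Since these rules are not recursive, the loop stabilizes after a single pass; the point is just the direction of inference, and that the negation (~Composite) is only evaluated once the positive facts have reached their fixpoint.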
Bottom-up? Ok, I see. From the tutorial it seems Logica reifies every predicate into an actual table (except for what they call "infinite predicates")
I found a way to look at the SQL it generates without installing anything:
Execute the first two cells in the online tutorial Colab (the Install and Import). Then replace the 3rd cell content with the following and execute it:
%%logica Composite
@Engine("sqlite"); # don't try to authorise and use BigQuery
# Define numbers 1 to 30.
Number(x + 1) :- x in Range(30);
# Defining composite numbers.
Composite(a * b) distinct :- Number(a), Number(b), a > 1, b > 1;
# Defining primes as "not composite".
Prime(n) distinct :- Number(n), n > 1, ~Composite(n);
Where does your intimate knowledge of Datalog come from? Do you use Datalog regularly (maybe with Datomic)? Or did you just study how it is implemented out of curiosity?
I can't say that I've understood it all, but he appears to criticize scientists for not thinking about play seriously, and instead reducing most aspects of animal behavior to things like survival, fitness, and evolutionary pressure.
> Animals (and humans) evolved play because it has evolutionary benefits
I know the turn of phrase is popular even amongst biologists but I still think it’s weird to put it this way.
Evolution is a dynamic feedback loop on a multi-generational time scale. Plenty of neutral things can be transmitted for a long time without being culled out by evolutionary pressure and social behaviour can remain for a long time without being genetic at all.
Play has a purpose: it hones reflexes, teaches youth about concealment, traps, ambushes, what their bodies can do, and whether running away works in a scenario or fighting is better.
It also shows who is best to lead a fight. Play, much like curiosity, makes you able to navigate your environment. It is extremely useful.
That’s not the point. Evolutionary pressure doesn’t care if something is or isn’t useful per se. It’s all situational anyway.
To simplify a lot, either you have offspring and your genes spread or you don’t and they don’t. Evolution is not a purposeful process towards fitness. It’s a reification of the results of the way genes are passed, how they and the environment affect individual characteristics, and how an individual’s fitness to their current environment impacts their chance to mate.
That's not accurate at all. Evolution does care if something is useful, if it aids in reproduction. Learning how to fight, protect, run, hunt, tricks, all of these (such as ambushes, hiding, etc) helps both prey and predator survive in the wild.
Play hones reflexes. Its entire purpose is to train young animals on tactics, and on how to use their body, and on their local environment.
Put another way, if you aren't trained to use your body, you're more likely to die. The same goes with not learning tactics. Or what the local environment is like.
If you don't know what a tree is, if you don't know what a hole in the ground is, what a hill is, how well grass hides you or not, you are at a major disadvantage, if you're hunting, OR if you're hunted!
There can be other mechanisms to learn things, but play is one of them, and having children teach each other, lets the adult protect, and gather food to feed. It also ensures that youth is trained up on the current environment, not one that the parent recalls from youth.
> Evolution does care if something is useful, if it aids in reproduction.
Evolution is an abstraction subsuming other mechanisms. It doesn’t care about usefulness. That’s a finalist bias. Individuals reproduce or they don’t.
The rest of your post is pure conjecture. You can’t work backward saying: this thing is useful therefore it’s evolved. That doesn’t make sense especially for complex behaviour.
Sure, maybe, but if you tried you could come up with a similar explanation for literally any behavior or emotion. It might be true, but it isn’t falsifiable.
Does the complicated flowchart point to deficiencies in the Slack user interface? If the user cannot intuit the flowchart, then how can they (as several sibling comments rightly point out) reliably turn notifications on or off?
"Feature flag" development culture is in direct conflict with user's having intuitive, consistent experiences that they can model in their head.
If the vendor needs a database report to see what features the user may encounter in any given session, because it's an n-dimensional matrix that changes based on uncountably many factors, there is no mental modeling to be done. The user just experiences some idiosyncratic amalgam of code in each session, and the vendor watches aggregate metrics to make sure the number of users in immediate crisis remains below some threshold -- bringing in a $XXXX/hr on-call team to identify and apply some adequately impactful change if it breaks over. Meanwhile, the users-in-crisis cross their fingers that the next time they open the app, they get a better roll on the feature matrix and encounter a more tolerable subset of issues.
If you want to be able to understand your software and know how to turn things on and off, you need to demand a whole new (old) approach to building and publishing things. We're way off track right now.
Feature flag development culture is the only practical way to do continuous delivery. The old approach is still used in markets where it makes sense - database systems, for example, often have only a handful of releases per year. But there's a lot of markets where it's just not feasible to make customers wait 3-6 months between when you finish developing a feature and when it's available for use.
> Feature flag development culture is the only practical way to do continuous delivery.
1. That's not true. Even with continuous delivery, you can still just have a sane and mostly stable roadmap that you're betting on, and gradually migrate all your users along it in a cohesive way with thoughtful transitions when any radical changes are needed. Feature flag stuff is a specific approach to product design that took over a lot of engineering teams in the last 15 years. It relies on continuous delivery, for the most part, but continuous delivery is itself completely agnostic about it.
2. Continuous delivery is rarely necessary in the first place, and almost never in the interest of users. It gained ubiquity as it first became practical and coincided with product designers' naive enthusiasm for feature flags, but mostly works against user satisfaction by making it hard for users to rely on their tools being in a specific condition at any given time.
Maybe Continuous Delivery is the problem. What are these markets where customers want such frequent changes? As a user, I don't really want my software changing daily, weekly or even monthly. I don't want features to come and go depending on which arm of an A/B test I am in. I don't want to be part of the developer's "experiment" and metrics-gathering.
The actual need for continuous delivery is mostly related to running public-facing services at large scale that need the ability to adjust to external events quickly, e.g. when defending against a new form of abuse requires an immediate software change.
All other software could get away with deliberate release cycles rather than an urge to YOLO things into production. I just think that there is a fallacy in business leadership that ultra-short turnaround times on features actually matter for a business.
Engineering is about making certain that stuff works before it enters production. If we claim to do software engineering, then that's just a simple part of doing our job. Everything else is just tinkering and hacking (the bad kind).
From an engineering point of view, yes. But not from the business point of view.
It's like a restaurant. You can make sure the new dish tastes like what you intended and isn't going to poison anyone, but until it's on the menu you don't really know if people will want to eat it or if it will be a success for the business.
This does make me wonder whether there's a place for a system where you split your userbase into N shards, and only deploy new versions to one of them, reducing the update rate from a user's perspective by a factor of N. Of course, this means you have to maintain compatibility with N versions, but that's still better than N feature flags.
That whole approach is so disrespectful of users. If you are a serious business, then hire sufficient QA people, and empower them to do their job.
Expecting your customers to do your QA for you is super common these days. But that doesn't make it right.
(What really infuriates me is the general lack of feedback channels that enable users to actually contribute to QA. With open source software, end users can at least open a ticket on the bug tracker. If it's a commercial product, all too often the only point of contact is a disempowered call-centre worker, who can't even log a bug for you.)
Agreed. Speaking as a user, I much preferred when the industry in general had a slower release cadence. I want to update once every year at most, and usually I want to update less frequently than that. Continuous delivery is a thorn in my side.
No individual customer wants frequent changes. They want changes they care about ASAP and changes they don't care about never. But unless I'm maintaining per-customer versions (which can make sense for high value enterprise customers!), promising user X that I won't make any changes until January necessarily means telling user Y that I refuse to resolve their showstopping bug report until January.
I'm right-handed, and usually use scissors with my right hand. When I try using them with my left hand, two things happen: First, they don't sit right in my palm, because of the way they are sculpted. Beyond this, and more importantly, they are no longer as effective at cutting things.
Here's my understanding of why this is. When holding the scissors with your hands, in addition to the up-and-down force you exert on the blade with your fingers, you also exert a small side-to-side force with your fingers. With my right handed scissors in my right hand, this force pushes to the outside on the upper handle and to the inside on the lower handle. This force also makes the scissors feel more comfortable.
On the other side of the fulcrum, though, the upper handle controls the lower blade and the lower handle controls the upper blade. Here, the lateral forces end up drawing the blades closer together, giving a tighter pair of edges between which shearing forces are applied. This makes the cutting action more effective than if lateral forces were absent.
In my left hand, the (outside at the top and inside at the bottom) lateral forces end up pushing the blades further apart on the other side of the fulcrum. This reduces the shear force and makes the cutting action less effective.
To compensate for this while operating the scissors with your left hand, you would need to adopt a weird style: Consciously pull to the inside with your thumb, and to the outside with your remaining fingers. You'll notice that the scissors are now much more effective than before. It is also a deeply uncomfortable grip.
The issue is that scissors are (surprisingly) chiral sculptures. In the case of regular right-handed scissors, when viewed edge down, the handle closer to the viewer passes through the left of the fulcrum. I have never used a pair of left-handed scissors, but I would presume that for them, the closer handle passes through the right side of the fulcrum.
This is exactly right. Scissors depend on you to exert sideways pressure on the blades. It should be possible to build scissors with e.g. a spring at the joint that exerts the proper sideways force automatically. This is how paper cutters work. (A square surface with a blade on the side that swings down.)
I see a number of comments here about giving awards to organizations rather than individuals, and counter-comments pointing out that Nobel's will disallowed it.
How is the Nobel Prize actually administered? For how long is the Nobel committee bound to follow Alfred Nobel's will? And aren't there laws against perpetual trusts? Or is the rule against awarding the technical awards to organizations one that the committee maintains out of deference to Nobel's original intentions?