Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I agree with the author, we need better primitives, if you need functionality now:

Major tools that exist today for partial structure traversal and focused manipulation:

- Optics (Lenses, Prisms, Traversals)

  Elegant, composable ways to zoom into, modify, and rebuild structures.

  Examples: Haskell's `lens`, Scala's Monocle, Clojure's Specter.

  Think of these as programmable accessors and updaters.

- Zippers

  Data structures with a "focused cursor" that allow local edits without manually traversing the whole structure.

  Examples: Huet’s original Zipper (1997), Haskell’s `Data.Tree.Zipper`, Clojure’s built-in zippers.

- Query Languages (for semantic traversal and deep search)

  When paths aren't enough and you need semantic conditionals:

    - SPARQL (semantic web graph querying)
    - Datalog (logic programming and query over facts)
    - Cypher (graph traversal in Neo4j)
    - Prolog (pure logic exploration)

  These approaches let you declaratively state what you want instead of manually specifying traversal steps.


Agreed; also Traversable in Haskell is a simpler abstraction than lenses and pretty directly addresses what they seem to be looking for: https://hackage.haskell.org/package/base-4.21.0.0/docs/Data-...


Traversable and lenses are very closely linked. If you go to the original paper leading to Traversable [1] and read through it, it feels basically identical to reading through the parts of the lens library that lay down the core abstractions and the laws implementations must follow if you want to be able to blindly manipulate them. In fact, the traverse function is a Traversal, and so fits trivially into the lens ecosystem.

[1] https://www.cs.ox.ac.uk/jeremy.gibbons/publications/iterator...


Arent haskell traversables different in that they preserve the structure if you were to map over them, as compared to the solution posted in the article, where they get flattened to what amounts to a list?


These are what I think the author is looking for. But it shouldn't be a "primitive" in terms of code automatically generated by the compiler, but an interface or typeclass like your examples (in a language advanced enough to have them.)

The problem is that 'lens', 'monocle', etc. are famously abstract and difficult for people to apply to their actual problems. IMO, the solution would be for standard libraries to specify interfaces called 'BreadthFirstTraverse', 'DepthFirstTraverse', etc.


I definitely agree for traversals, but Lenses need some sort of primitive support - even in Haskell they're mostly generated with TemplateHaskell, and the language developers have spent a long time trying to make the `record.field` accessor syntax overloadable enough to work with lenses[1][2]. Hopefully someday we'll be free from having to memorize all the lens operators.

Optics are famously abstract in implementation, but I don't think people have trouble applying them - people seem to like JQuery/CSS selectors, and insist on `object.field` syntax; it's kind of wild that no mainstream language has a first-class way to pass around the description of a location in an arbitrary data structure.

[1] https://ghc-proposals.readthedocs.io/en/latest/proposals/002...

[2] https://ghc-proposals.readthedocs.io/en/latest/proposals/015...



Optics let you concisely describe the location, but defer the dereferencing, so you could definitely approximate optics, not by passing around pointers you compute with `offsetof`, but passing around functions that use `offsetof` to return memory locations to reference (read/write to). You could certainly write a composition operator for `*(*T) => List<*R>`... Some people have done something like it[1][2]:

    Account acc = getAccount();
    QVERIFY(acc.person.address.house == 20);

    auto houseLens = personL() to addressL() to houseL();
    std::function<int(int)> modifier = [](int old) { return old + 6; };
    
    Account newAcc2 = over(houseLens, newAcc1, modifier);
These also use templating to get something that still feels maybe a little less ergonomic than it could be, though.

[1] https://github.com/graninas/cpp_lenses [2] https://github.com/jonsterling/Lens.hpp


Haskell has 'Traversable' (and 'Foldable' etc) which are a lot more approachable than the fully generalised lens library.


> These are what I think the author is looking for. But it shouldn't be a "primitive" in terms of code automatically generated by the compiler

I think people are often too enamored by general purpose languages that can express such abstractions natively. I don't see an issue with a language that provides this as a primitive without being able to express it itself, constraints can be useful for other properties. Once you can traverse trees, most programming problems can be tackled even in such constrained languages, eg. SQL with CTE.


To point out a prolog thing which is also applicable to other languages with good patter matching: the break/return/prune examples are all ergonomic to implement as recursion in a way that fails in C++ style type based dispatch.


Don’t forget recursion schemes. The last example in the article was just asking for a hylomorphism.


Honestly when I read the article I was, "Do you have a minute to talk about our Lord and Saviour, derive (Functor) and how her son hylo saved us all?"


In Rust, you can extend the syntax a bit within the language itself.

For C++, you could define yourself a template that expands to the two functions you listed.

For any language, you could write yourself a pre-processor that adds for_tree notation and expands it, either with pre-processor semantics or working on the abstract syntax tree (which is more "proper" but also more work"). I would recommend the former to test notations, and then you can live with them for a while to experiment with them and see if and how often you really need the construct (recall that is how C++ evolved from C via "C with classes" - it was first a pre-processor). Once you and others are 100% convinced of the new syntax, go to your prefered language's working group/ISO committee and propose inclusion.

My own feeling is that this is not something for which I need more than recursion; calling inside traverse() traverse(left) and traverse(right) for binary trees or using a normal for loop to iterate over all this->children() from 0 to this->children.size() is something that occurs in graph libraries and parsers/compilers once in a while but not often enough to warrant its own notation. Rather, when I look at languages like C++, I'd have a language that is simpler, cleaner, more homogeneous and more orthogonal; C++ is complicated, convoluted, too large to implement it correctly by a single person in reasonable time (compared to beauties like C, Pascal, Scheme), so I stand on the side of minimalism.


The "syntax extension" thing is precisely the reason I have an interest in (and advocacy for) Common Lisp.

Be that as it may, for C++, Eric Neibler's [Boost.Proto](https://www.boost.org/doc/libs/1_84_0/doc/html/proto.html) could likely help conveniently connect syntax to implementation to achieve something similar to what the author is taking about.


SQL recursive CTE maybe?


Even just traverse from scala, haskell, kotlin, etc will work iterating over trees.


how about we also get regex-parsable streams (IO::Async in perl has something like it, suboptimal perhaps) and regex-parsable treestructures (totally possible)? seems like just having the ~= work on structures (or whatever the API is called in other languages, this being Perl5)?


Does this mean something like $tree_object =~ m/regex/ will work and return a sub tree?


Yeah and if you think about it is totally doable. The FSM or whatever automata is just picks and exhausts branches that provide the correct value.

Definitely can be done to streaming data… protobufs in a way is this, given it’s a sort of BNF from a bird’s eye.


I still love the Scrap Your Boilerplate papers.

https://wiki.haskell.org/index.php?title=Research_papers/Gen...


Seriously, I thought of those the moment I read the title of this post.


This seems like dramatically over complicated iterators. Do they really need to be called 'optics' and 'lenses' and 'prisms' ?

Think of these as programmable accessors and updaters.

How is iterating through something already not 'programmable' ?


Indeed they call for new names, as they encompass far more than iterators.

If you read a bit more about them, I think you will be surprised to see the breadth of what these abstractions can be used for. To start, they've been used to build a new compositional foundation for game theory[1], and they can be used to model gradient-based learning in machine learning[2].

As for their simpler applications as getters and setters, they are programmable in the sense that you can think of lens/prism types as interfaces that can be implemented arbitrarily. So you can create your own lenses and combine them like legos to construct new, "bigger" getters and setters from the smaller components.

[1] https://arxiv.org/pdf/1603.04641 [2] https://arxiv.org/abs/2103.01931

EDIT: fixed typo


This thread is about traversing a tree. At what point do we take a step back and think that iterating through a data structure and "building new compositional foundations for game theory" shouldn't be conflated together?

When does someone give up on the pagentry of programming and just get something done by looping through data instead of "constructing getters and setters to model gradient based machine learning".

It really seems like the straight forward way of doing things is to make simple iteration and worry about the game theory after that is all figured out.


I do get, and to some extent sympathize with, your position. But the comments to the article are, in large part, fundamentally addressing a higher level of abstraction than the more narrow scope of the article’s premise. As several commenters have referenced, traversing a tree in a ‘simple’ and narrow manner is addressed at the appropriate level of abstraction using already established mechanisms, particularly in Haskell using Traversable (a typeclass with a standardized implementation for trees). The discussion of Optics and Lenses are more of a side discussion about the syntax and implementations of a more broad set of data structures than merely trees.

I think your comment is valuable in spirit because it could bring a more thorough discussion of the creation or popularization of a syntactically clean and a shift of the idiomatic-ness of such a solution to traversing trees. Leaving the more abstract and general Optics discussion to be a side dish for commenters to discuss as well.

Also, just a nitpick, but traversing a tree is a broader topic than iteration. Iteration implies, both colloquially and formally, performing a computation on each element of a data structure, while traversal is a more general algorithm that could be the basis of iteration, but may also be more akin to a search through the data structure without expectation of performing computation until the searched for ‘member’ is returned from the traversal. So it’s not an apples-to-oranges comparison with your conflation of iteration and traversal, but does seem a little like Canis Familiaris-to-Mammal comparison.


I read your whole comment but I'm still not getting where the actual rubber meets the road. If I have a tree data structure what am I missing? What is it that I can't do that makes this extreme iteration complexity worthwhile? To me, having simple access to the data structure is enough because that's what I want a data structure for.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: