Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

This is the largest codebase I've seen that makes extensive use of the new Python match: statement introduced in Python 3.10 - example: https://github.com/dosisod/refurb/blob/master/refurb/checks/...


Thank you! I have found that pattern matching has made this project a whole lot easier.

One of the other cool features about this project is how the check() functions are loaded based off the type of the node argument:

https://github.com/dosisod/refurb/blob/master/refurb/loader....

This also supports type unions, so you can register your check for binary and unary expressions by typing node as `BinaryExpr | UnaryExpr`.


That code, starting on line 41, has convinced me to continue avoiding Python. Python has went from a clear concise language with one right way to do things, to outpacing C++ when it comes to adding esoteric language features.


that code is pattern matching on an ast node; the complexity is from the data structure involved, not the language. there is no more concise way to match a specific node and get at the relevant fields, just various different forms of verbose code.


If anything, this matching on an AST is very readable and elegant, and that just convinced me that the syntax of Python seems really good to write a parser.


that AST parsing code looks like something written in Scala ...

seems like Scala is moving to more Python like and Python is moving to add some features from Scala.


I’m actually genuinely curious what alternative you might present from any language. I always thought that was relatively concise for such a complex data structure.


They're complaining about types, :=, and match.


That code has nothing to do with conciseness, he could've extracted the arguments in separate variables


Wow, that's a lot of code smell to me.

Why specify all the specific fragments of mypy.nodes, instead of import *?

Docstrings in the middle of code should just die 1/2 of the screen is wasted

Why use a case statement if it only has one match?

I can't even figure out what lines 43-55 do, but that's a lot of nesting


The reason for using a match statement, even though there is only “one” case, is because it’s matching against a complex and (as you mentioned) deeply nested pattern. Lines 43-56 match the current AST node against the pattern (shown here as pseudo-python):

  with open(_) as f:
    _ = f.read()
(Where _ can be anything, and f can be any name). The traditional alternative to lines 43-56 would be dozens of lines that check the top level of the AST, and if that matches against what the top level of the pattern is expecting, diving one layer deeper and checking that level against the next level of the pattern, etc, until you reach the leaves of your pattern. Once your familiar with pattern matching syntax, this is much easier to quickly read/grok/audit.


So it can do multi-level tree matching?

Very interesting…

One of the projects I occasionally poke at is a burs (bottom-up rewrite system) generator used to do cost-based tree rewriting. Think this is a concept that might make sense in that context, just need to find the time to play…


> Why specify all the specific fragments of mypy.nodes, instead of import *?

glob imports are discouraged as it is hard to figure out where a name comes from (or if it’s just a typo).

> Docstrings in the middle of code should just die 1/2 of the screen is wasted

You mean the class level doc string? That shows up in the help of that class, so there is nowhere else to put it.

I also don’t see the point of a match with a single case.


Does "code smell" just mean "style I don't like"? Most of your concerns have nothing to do with what the code is actually doing.

Line 43-55 is the scrutinee (pattern) of the case statement. It only binds values to the given variables and enters the block if `node` matches the quite complex structure shown there.

The alternative would be doing something repetitive like this:

    if not isinstance(node, WithStmt):
        return
    expr = node.expr
    if not len(expr) == 1:
        return
    if not isinstance(expr, CallExpr):
        return
    callee = expr.callee
    if not (isinstance(callee, NameExpr) and callee.name == 'open'):
        return
    ... etc ...
You could do the same thing with duck typing and try/except, but the semantics would be different (sometimes you DO want nominal typing, even in python).


Why not just something similar to (forgive the pascal)

  if isinstance(node, WithStmt) AND
    (len(node.expr) = 1) AND
     isinstance(node.expr, CallExpr) AND
     isinstance(node.expr.callee,NameExper) AND
    (node.expr.callee.name == 'open') then
  begin
    do stuff here
  end;


> I can't even figure out what lines 43-55 do, but that's a lot of nesting

It's a very elegant pattern match on a deeply nested structure. It's declarative, not imperative, so if you are concerned about "nesting complexity", you need not be.

How would you implement this otherwise?


I honestly don't understand what's going on... but in Pascal I'd use short circuit expression evaluation to make the comparisons as I go into the depths.

There's too many layers of nesting, in too foreign a syntax for me to even parse.

I suspect it could be done a clearer manner in pascal.


I would truly love to see it in pascal. I would be happy to give my opinion on it as a "foreign syntax" to me


In addition to some of the other answers about the match statement, mCoding has this excellent video[0] explaining the match statement, specifically using it to match complex AST trees, like Refurb does. Note, not all of these checks use the match statement[1], sometimes it is more verbose to use a match statement vs a normal if/else statement.

[0]: https://youtu.be/ASRqxDGutpA?t=470 [1]: https://github.com/dosisod/refurb/blob/master/refurb/checks/...




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: