WTFPython: Exploring and understanding Python through surprising snippets (github.com/satwikkansal)
251 points by Tomte on Aug 27, 2023 | 80 comments



Some of the substantial previous discussions:

* https://news.ycombinator.com/item?id=31566031 (460 points | May 31, 2022 | 139 comments)

* https://news.ycombinator.com/item?id=26097732 (359 points | Feb 11, 2021 | 162 comments)

* https://news.ycombinator.com/item?id=21862073 (396 points | Dec 23, 2019 | 185 comments)


This drives home some of the things I really like about scheme. Scheme is built on a few selected primitives and much of the language comes together on top of that.

Since learning that I have started disliking all the things that I just have to accept in other languages. I can accept the c++ variant of "because this and that comes together like this. Because speed", but the pythonesque "because because" drives me insane.


Are you using Scheme in production? I'm super curious about it being applied in the "real world", as I'm currently working my way through SICP.


No way. Scheme is a hobbyist language at this point.

Simply put, the difference between a 'hobbyist' and a professional language, for me, is the strength and depth of the packaging ecosystem.

So as soon as you need to do things like open and parse PDF, implement TLS/SSL, create a dashboard,... you will be in trouble pretty quick in scheme. Probably simple stuff like handling unicode would be shaky.


Nobody would re-implement TLS in a garbage collected language, at least if they planned to use it. There are libraries for that. And who in their right mind would implement PDF parsing themselves? The more I familiarise myself with the PDF format the happier I am with the libraries that exist.

I did write a web app that served a couple of hundred people with information, chats and live updates at a conference a friend was hosting. It was written in guile and made heavy use of delimited continuations which is probably my favourite way of handling state on the web. The main logic was less than 2000 lines, so it was by no means a large project.

What about the unicode handling would be shaky? And why do you say handling unicode is simple?? :)


Afaik racket can do all of what you have mentioned. I have run a few slightly non trivial dev systems using it with good success. Ofc, company never knew about it.


Nope. I am not a programmer, except for that time I got paid to write common lisp. The only thing except that I have done for money is some data processing using python.


Wow. Can't believe a couple of these are in my production. Esp with the mixture of py2 and py3. And people who are not well versed in python are just not aware of these gotchas.

Good repo, I learnt quite a few more... wtfs. :)

I have a strong belief that unless type hints, testing, and tooling are leveraged properly, python does not qualify as a good candidate for long term projects. Go handles all of this automatically and I'm having good results with it as an alternative to Python for small to medium scale projects.


Consider not holding strong beliefs without empirical evidence. There are both plenty of long term python projects that have held up, and no strong evidence one way or the other on type safety.


The only evidence I've seen in this respect suggests typing is not the leading cause of bugs in Python. I wish I could find the talk but essentially the guy looked at a load of GitHub issues and analysed the cause and only like 1% or so were type related.

I like putting in types just to help my IDE with autocomplete, though. I have also caught a few errors with mypy that could have caused a crash in very unlikely cases. I'm not convinced it's worth rigorously typing a project. It seems like mostly a nerd snipe because I've noticed there is a satisfaction with getting it right despite not making any difference to the user.


I'd say requiring types does more for onboarding and development speed for new developers than for catching bugs. Bugs can only be caught to a certain extent. But when you are figuring out what each variable is and what it signifies as it gets passed around a large codebase, typing the code becomes essential and helpful. Pair this with a good IDE/editor and your development is pretty fast.


There are both merits and demerits to python. Unfortunately not everyone has the opportunity or the luxury to work with projects that are properly maintained. Even messy python projects in legacy companies hold up but that does not mean quality and maintenance costs are good. My beliefs stem from my experience with the jobs available in my area and country. I'm not the expert on type safety but my understanding from various sources and my own experience is that type safety is superior. Go has been the best candidate as the alternative for the projects I have undertaken. YMMV.


For most of these, type hints or static typing do not help at all. I haven’t gone through all of them, but in the first quarter or so almost all of them were due to leaky compiler weirdness in CPython.


I haven't tested out the repo with type hints so I can't comment on that. What I did mean to say was that outside of this repo, type hints are a huge boon. There are enough legacy projects written by non python experts which would benefit from what I said above.


I agree, there should be a book specifically made for python-wtfs, e.g. corner cases, practical hacks, name it as 'python traps and pitfalls'


One python library WTF I came across recently is the autojunk parameter of the difflib SequenceMatcher class. Despite constructing the class with

    difflib.SequenceMatcher(isjunk=None, ...

The matcher will start classifying characters as junk once the b parameter is more than 200 characters long. So unless you read the docs carefully, your matcher will start not matching anything longer than a few characters once the input gets longer than a few sentences. The trick is to also set autojunk=False in the constructor.

It's like the original writer of this component wrote the matcher for a specific application (DNA matching?) and left in the heuristic even though it isn't applicable to general use. At some point they added the autojunk parameter to gate this behaviour, but it still defaults to true, to keep backwards compat, and to confuse people.

https://docs.python.org/3/library/difflib.html#difflib.Seque...
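
A minimal sketch of the effect described above, assuming CPython's documented heuristic (items that make up more than roughly 1% of a `b` sequence of at least 200 items are treated as "popular" and skipped). The input strings are made up for illustration; `bpopular` is the attribute the matcher exposes for the items it decided to ignore.

```python
import difflib

# Hypothetical inputs: the same 400-character text behind two different first characters.
text = "the quick brown fox " * 20
a = "X" + text
b = "Y" + text

# isjunk=None in both cases; only the autojunk flag differs.
default_matcher = difflib.SequenceMatcher(None, a, b)  # autojunk defaults to True
strict_matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)

# Items of b that the heuristic chose to ignore when looking for matches.
popular = default_matcher.bpopular

default_ratio = default_matcher.ratio()
strict_ratio = strict_matcher.ratio()

print(sorted(popular))
print(default_ratio, strict_ratio)
```

With these inputs every character of the repeated text ends up in `bpopular`, so the default matcher should report a much lower similarity than the one built with `autojunk=False`.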


Python's "int" optimization of pre-allocating objects for values between -5 and 256 can be a great source of fun when writing extensions, too.

Same mechanism, overwriting the value stored in the allocated object; but when the extension is in loose C code which is casting pointers with abandon... things can go wrong.
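
For readers who have not seen the cache in action, here is a small sketch from the Python side. It relies on a CPython implementation detail, not a language guarantee, and the variable names are only illustrative.

```python
# CPython implementation detail: ints from -5 to 256 are pre-allocated and shared.
base = 250  # a runtime value, so the sums below are computed, not constant-folded

small_a = base + 6   # 256, inside the cached range
small_b = base + 6
large_a = base + 7   # 257, outside the cached range
large_b = base + 7

small_equal = small_a == small_b        # value comparison
large_equal = large_a == large_b        # value comparison
small_identical = small_a is small_b    # on CPython both names refer to the cached object
large_identical = large_a is large_b    # on CPython these are usually two separate objects

print(small_equal, small_identical)
print(large_equal, large_identical)
```

Because every `256` in the process is the same object, an extension that writes through a pointer to it changes the value for everyone.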


Honestly, I never understood why Python gained such a good reputation given the scale of its warts and don't get me started on its crippled lambda implementation.


Use it for a while and you'll understand why. It's a joy to use. That it is successful despite its warts just goes to show how nice it is compared to other languages.


I had forgotten about the walrus operator. I’m curious what the rationale is for not just allowing the regular assignment operator within expressions for that feature. Would it confuse the interpreter or something?


I would guess, to make sure the classic noob mistake of typing `if x = 3:` instead of `if x == 3:` stays a syntax error
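
A quick sketch of that difference, using `compile()` to check what the parser accepts. The helper name `is_syntax_error` and the sample data are made up for illustration.

```python
def is_syntax_error(source):
    """Return True if the parser rejects the given source text."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return True
    return False


# Plain assignment inside a condition is rejected by the parser.
assignment_is_error = is_syntax_error("if x = 3:\n    pass\n")

# The walrus operator is accepted in the same position.
walrus_is_error = is_syntax_error("if (x := 3):\n    pass\n")

# Typical use: bind a value and test it in one expression.
data = [1, 2, 3, 4]
if (n := len(data)) > 3:
    message = f"{n} items"

print(assignment_is_error, walrus_is_error, message)
```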


Depending on what you do on the lhs of = you can actually override this operator.

It could be setattr or setitem instead of bind-to-name. Because you can override = only in certain cases that are defined by the syntax, you can't easily elevate = to an expression and be done with it.
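
To make the three meanings of `=` concrete, here is a rough sketch with a hypothetical `Recorder` class that logs which hook each assignment form ends up calling.

```python
class Recorder:
    """Hypothetical class that records how assignments reach it."""

    def __init__(self):
        # Bypass our own __setattr__ so the log itself can be created.
        object.__setattr__(self, "log", [])

    def __setattr__(self, name, value):
        self.log.append(("setattr", name, value))

    def __setitem__(self, key, value):
        self.log.append(("setitem", key, value))


r = Recorder()
log = r.log

r.size = 3   # attribute target: calls Recorder.__setattr__
r["k"] = 4   # subscript target: calls Recorder.__setitem__
r = 5        # bare name target: just rebinds the name, no hook is called

print(log)
print(r)
```

Only the first two forms can be intercepted by the object; binding a bare name never consults it.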


Very nice website, I love these counterintuitive yet non absurd examples.


Love this. I’ve written a ton of Python in recent years but never used the ‘is’ operator. After reading this I’m glad I did not.

I can see usecases but clearly it should be used sparingly.


To be honest this response underscores the problem with pages like this. In Python, strings and numbers are objects, and “is” tells you if they are the same object. You wouldn’t compare strings or numbers in C using a pointer comparison, and you shouldn’t do it in Python either. The fact that it works sometimes in cpython is a coincidence.

It’s interesting to learn about how the interpreter is implemented, but that’s about it.


You should absolutely be using `is` where appropriate. `x is None` is almost always preferable to `x == None`. If you're checking for object identity, use `is`. If equality, use `==`. They're different use cases.


Especially since == can be overridden, while 'is' cannot.


That's right. Any class can define its own __eq__ method.


How is that your response, after reading this article? The is operator checks whether the two items are the same object, which is critical in some circumstances.


How do you check for None?


Or check that two dicts really are the same object, as opposed to two different dict objects that just happen to have the same keys/values?
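
A short sketch of that distinction, with made-up variable names: two dicts with equal contents are still separate objects, while an alias is the same object and sees every change.

```python
first = {"k": 1}
second = {"k": 1}   # equal contents, separate object
alias = first       # another name for the same object

same_contents = first == second   # compares keys and values
same_object = first is second     # compares identity
alias_is_same = alias is first

# Mutating through the alias is visible through `first`, but not `second`.
alias["k"] = 2
first_sees_change = first["k"] == 2
second_unchanged = second["k"] == 1

print(same_contents, same_object, alias_is_same)
print(first_sees_change, second_unchanged)
```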


I'm not sure I've ever had to do that. When is that a need?


My immediate first thought is, optimisation? If you know you’ve been given the same object, you could skip e.g. comparison, or change update logic


These kinds of micro-optimizations don't make much sense in Python. They complicate the code, and you are still 100 times slower than compiled languages.


It really… doesn’t have to complicate. And I disagree that optimising python code is never necessary. Not everybody is writing 100-line one-off glue scripts.

Also, you are somewhat changing the topic from “what is an example of when you might want to is-compare two dicts”, no?


You are correct. I kind of took `is None` for granted as it just feels boilerplate when coding in Python.

Although I have written over a hundred thousand lines of code in Python over the years, I use Python mostly for dev ops tooling, reporting, monitoring and automation so they don't get super complex and they mostly can lean on procedural programming patterns.

I could imagine complex frameworks needing heavy use of Objects that could lean on the 'is' keyword.


I personally do use "if var is None:", but can't you just use "if var == None"?


‘is’ checks if the object ids are the same, with None having a unique one. Equals can be tricked.

Here’s a class that is equal to None, and everything else:

    from typing import Any

    class EqualsEverything:
        def __eq__(self, other: Any) -> bool:
            return True
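
As a quick check of the idea, here is a self-contained variant that hooks `__eq__` (the method Python actually consults for `==`) and compares the two tests side by side. The class name `AlwaysEqual` is made up for this sketch.

```python
from typing import Any


class AlwaysEqual:
    """Hypothetical class whose instances claim to equal anything."""

    def __eq__(self, other: Any) -> bool:
        return True


value = AlwaysEqual()

equals_none = value == None  # noqa: E711 - deliberately using == to show the problem
is_none = value is None      # identity cannot be customised by the class

print(equals_none, is_none)
```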


Thanks, I saw this on SO too.

Just curious: would it ever happen in practice?


every little thing happens in practice... usually unnoticed and buried while refactoring something innocent/ly.

sooner or later the __eq__ method will be redefined for some class, then reworked, and then.. == None might not be what was supposed to be..

or, my favorite, x='a' ; (x,)[0] == x[0] == x .. but are only equal until x changes to something not-1-long-sequence..
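
That last one can be spelled out as a tiny sketch; the helper name is made up for illustration.

```python
def all_three_equal(x):
    # (x,)[0] is x itself; x[0] is only the first element of x.
    return (x,)[0] == x[0] == x


one_char = all_three_equal("a")    # a 1-character string equals its own first element
two_chars = all_three_equal("ab")  # "ab" is not equal to "a"

print(one_char, two_chars)
```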


I try to avoid statistical programming, preferring determinism, driven by intent. ;)

So, I use "is" since "is" is not a context dependent concept, like equals is. I've seen this once in the wild, and it made sense for its use:

    def __eq__(self, other):
        return bool(self) == bool(other)


It's destroying my stupid brain but interesting for sure.


Coming from C++ to Python, the thing that still gets me when I am tired is the arbitrary decision of passing some arguments by reference vs by value.

If you look at the argument list you don’t know what will be passed by value and what by reference. You will have to guess their types first. And that is risky.


Except that nothing is passed by reference (or everything is, depending on how you look at it). It's just that some objects have methods which can modify state (like list, dict) and others don't (like int, str). Ned Batchelder did a good presentation on this many years ago which is still probably the best I've seen for explaining how it works to people coming from other languages - https://nedbatchelder.com/text/names1/names1.html - where the names passed to a function act just like regular assignment.
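
A small sketch of that model, with made-up function names: the parameter is just another name for the caller's object, so mutation is visible to the caller while rebinding is not.

```python
def append_item(items):
    # Mutates the object the caller passed in.
    items.append(99)


def rebind(items):
    # Rebinds the local name only; the caller's list is untouched.
    items = [0]


def add_one(n):
    # ints have no mutating methods; += creates a new int and rebinds the local name.
    n += 1
    return n


numbers = [1, 2]
append_item(numbers)
rebind(numbers)

count = 5
result = add_one(count)

print(numbers)
print(count, result)
```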


But it works. It's like being sceptical of Spanish because of pronoun elision. It's no more or less good than any other language. Bugs are bugs. I've never actually had to learn these rules in Python. But in C/C++ they crop up very quickly.


I would find this funny if each and every one of these didn't represent a potential bug in software. Some are really esoteric (like the name mangling example) but many are something you might do accidentally.

Sadly, there are no easy paths to fix this because compatibility (https://xkcd.com/1172/), but for greenfield projects which aren't expected to be small throwaway projects, using Python is not necessarily a very good idea.


It's a huge leap to conclude "for greenfield projects ... using Python is not necessarily a very good idea".


It's certainly a leap... but is it wrong though? You might conclude either "yes" or "no", but I don't think dismissing it outright is very conducive to a good conversation.


Unjustified leaps tend to be wrong. And yes it's wrong because if you apply the criterion consistently then you will end up never choosing any language or tooling ever because everything has some kind of problem.


Sure, but you can choose the one which is the most suitable with the least amount of problems. I would argue that Python has many problems, so it's suitable for a fairly small set of use-cases. (by design, btw)


shit like this is why anyone calling themselves a software engineer should have to get a license and formally sign and seal their code


I still have the feeling that those oddities are still a bit less bad than the ones of js.

I am still disappointed by python because I am so addicted to all the fun things of python, yet python is inadequate for game development.

I use godot, which has a python-flavor language, but it's missing A LOT of what I love about python: list comprehension, tuple, set, and many others. And now that I think about it, it's going to be difficult for them to evolve the language, although I often prefer to break codebases.


> I still have the feeling that those oddities are still a bit less bad than the ones of js.

Talk about setting the bar low.


I'm overall not super happy with python's game ecosystem either (especially the lack of gui based editors/engines) but if you're willing to look past the lack of an editor Panda3D is a pretty good/robust game engine for Python. Disney used it for their ToonTown and Pirates MMO along with a lot of themepark rides (visualization/actually using it in the ride) so it's pretty mature. It's a lot more flexible than Godot and you get access to the entire Python ecosystem.


I'm honestly at a loss to understand why anyone would expect a single-threaded, dynamically-typed language like Python to be anywhere near performant enough for a game engine.


I only know PyGame and Panda3D, but both of those have their core written in C or C++, and use multiple threads. Python is just intended as the scripting language.


>I still have the feeling that those oddities are still a bit less bad than the ones of js.

I'm truly interested in hearing why does that matter?


just comparing two interpreted languages, i'm trying to say that I prefer one language over the other

i'm doing some whataboutism, sure


At least it's not lua. My personal wtf was with lua ignoring variable scope.. it's just a hashtable and it all goes into it - https://onecompiler.com/lua/3zjsw5up3


I'm not sure what your code example is supposed to demonstrate... All I can say is that Lua uses lexical scoping for local variables (which need to be declared as "local").


where does it ignore the scope? use 'local' and it stays in the function scope, without 'local' it's global, other scripting languages (e.g. bash) do the same.


Lol.

Funny how when JS is involved the frame is "WTF, this is why JS is a shit language and no one should use it".

Whereas with python it is "WTF, Python is great and I don't understand it well enough".


To me, it's about how often you encounter these quirks when you use the language "intuitively" (very subjective, I know).

I encountered dozens of "JS WTF" when learning without actively trying weird things. It's ultimately on me for not understanding the language better, no argument here, but it feels unintuitive.

And for Python, while I agree with most of cases listed in the repo to be indeed WTF (and a very good resource to learn it deeper!), I don't really encounter most of them naturally, other than the implicit string literal concatenation and default mutable arguments.
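
For anyone who has not hit those two yet, a minimal sketch of both; the function and variable names are made up for illustration.

```python
def add_tag(tag, tags=[]):
    # The default list is created once, when the function is defined,
    # and then shared by every call that omits `tags`.
    tags.append(tag)
    return tags


first = add_tag("a")
second = add_tag("b")
shared = first is second  # both calls returned the same list object


def add_tag_safe(tag, tags=None):
    # Common workaround: use None as the sentinel and build a fresh list per call.
    if tags is None:
        tags = []
    tags.append(tag)
    return tags


# A missing comma silently joins adjacent string literals.
names = ["alice", "bob" "carol"]

print(second, shared)
print(add_tag_safe("a"), add_tag_safe("b"))
print(names)
```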


I disagree and think it is the other way around.

The TFA shows several examples of code that someone learning the language would definitely hit. Most of the time JS "quirks" are due to code that is so complicated that the actual WTF is on why would someone design such code in the first place.


Really? To me most of JS quirks are things like `(1 == '1') == true`. Or why there are both `for .. of` and `for .. in`. Maybe you're just so familiar with the language that you forgot about these things.


Just use '===' and all these "quirks" go away. Or learn the proper semantics for false-ish/true-ish (takes about 20 mins.) and you're on the other side.

The Python ones are abhorrent, here's a few of them:

* the 'is' operator behaving differently, even when called with operands of the exact same type?

* (from TFA) # This will print True or False depending on where you're invoking it (python shell / ipython / as a script) (WHAT?!)

* No multi-line lambdas because that makes the AST unparseable, literally. Then cover it up with some shit argument about how ackchually is more "pythonic" to only use one line functions, lol.


All languages have quirks. It's the degree of WTF-ness and how much they affect the majority of its users that makes a language popular as a WTF-y language.

The more the users the more chances of this too.

Personally I find Python to be ergonomic and I've been less bitten by it. But you might get a totally different opinion if the same question is asked to a different group. This is a Python post so I guess majority would be pro-Python?


[flagged]


The license suggests you can do as you please with the project. Feel free to rewrite it to conform to your own sensibilities.


Conversely, vulgarities drive your point across stronger (given that you aren't using them every second sentence). Compare: "the project is being late" with "the project is getting fucking late", for example.

I don't really get the self-censorship though. Either don't use vulgarities, or actually use them, not this self-bowdlerising stuff.


Stand in front of any crowd and see how uncomfortable everyone gets as you speak like that.

Vulgarities in conversation and writing show a lack of class, culture and intelligence. Often your age is showing, too, as youth seems to think it's "no big deal".


Alternatively, culture and sensitivities are changing and you're not keeping up with them.


One can. Everyone can.

It happens that words have meaning and using "vulgarities" has its place and meaning too.

You may not agree, language is a dynamic thing, but even in technical or professional settings, these words have their use.


Being vulgar for the sake of being vulgar is childish.


You are the one saying it is for the sake of being vulgar and not to convey a necessary strong emotion.

You can choose to cut yourself from this part of your humanity. That is your choice. But don't scowl at others as childish because they have real emotions to express.


One can express strong emotion without vulgarities in every language in multiple ways.


And there are also some emotions for which vulgarities are the right choice


> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.

> Please don't pick the most provocative thing in an article or post to complain about in the thread. Find something interesting to respond to instead.


It has 33k stars and has been around for 6 years. Sounds like people don't care.


From 2010:

https://news.ycombinator.com/item?id=1121932

https://github.com/denysdovhan/wtfjs

My my, how HN sensibilities change over time. Hacker news, the place where the language police now reigns supreme eh?


Some users have always complained about profanity on HN and it has always been a minority. I don't think sensibilities have changed much.

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...



