It's great that this contest has been running for so long and still produces interesting new approaches, but I wish there were more work being done in the field of underhanded code contests, e.g. [0].
One area of technology that seems to have continued the work of discovering underhanded techniques is the realm of cryptocurrency, specifically the "Solidity Underhanded Contest". The results for 2020 are here[1], which links to (spoiler alert) a great trick on line 65 here[2] (select the line character by character with a mouse to reveal it).
Great example of why Unicode and source code do not mix well... it reminds me of when I taught CS courses and had a few instances of very confused students who managed to somehow get one of those "invisible formatting characters" into their files. Fortunately, a hexdump --- or even just an "old school" ANSI-only editor --- reveals all.
One of the strange things I've encountered is a bare \r (i.e. no accompanying \n) in the middle of a line from a Windows-sourced file. Don't know how they got there, just that redshift didn't like it one bit.
I think how it happens with Windows is that one Windows editor adds \r\n, then the file is edited in an editor that isn't properly aware of Windows line endings and the \n gets deleted. I've seen this happen before. It's especially likely to happen if the file has mixed line endings.
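For anyone wanting to check a file for this, here is a rough Python sketch that counts each line-ending style in raw bytes. The function name and the returned keys are made up for illustration; it only counts, it does not try to repair anything.

```python
import re

def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, bare CR, and bare LF in raw file contents."""
    return {
        "crlf": len(re.findall(rb"\r\n", data)),
        # A \r that is NOT followed by \n is a stray carriage return.
        "bare_cr": len(re.findall(rb"\r(?!\n)", data)),
        # A \n that is NOT preceded by \r is a Unix-style line ending.
        "bare_lf": len(re.findall(rb"(?<!\r)\n", data)),
    }

sample = b"a\r\nb\rc\nd\r\n"
print(count_line_endings(sample))
```

A nonzero "bare_cr" count, or more than one nonzero count, suggests the mixed-line-ending situation described above.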
It’s quite a common problem with pasting commands from emails (or I guess some other webpages.)
Two spaces when writing an email get converted into at least 1 non-breaking space and these make bash sad. Unfortunately there isn’t really a good way to have bash treat them as white space which is annoying.
Compose Space Space on Free OSes. It’s useful sometimes, though I wish there was a default binding for zero-width space — I resorted to a non-breaking one to unbreak a link followed by an apostrophe yesterday.
I had a problem like this at work just before Xmas break. A script was returning "command not found" for 'chmod' because of an invisible Unicode character. The guy who asked for my help refused to believe it until I showed him a hex dump of the script; he was convinced when he saw 2 bytes between the command and its parameters.
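If a hex dump is too noisy, something like the following Python sketch can point straight at the offending character. The function name is hypothetical, and the sample input assumes the culprit is a non-breaking space (U+00A0, which is two bytes in UTF-8); the actual character in the story above is not known.

```python
import unicodedata

def find_suspicious_chars(text):
    """Return (line, column, code point, name) for non-ASCII or control characters."""
    hits = []
    # Split on \n only, so a stray \r stays inside its line and gets reported.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            # Cc = control characters, Cf = invisible format characters; tabs are allowed.
            is_control = unicodedata.category(ch) in ("Cc", "Cf") and ch != "\t"
            if ord(ch) > 127 or is_control:
                name = unicodedata.name(ch, "<unnamed>")
                hits.append((line_no, col, f"U+{ord(ch):04X}", name))
    return hits

script = "chmod\u00a0+x run.sh\necho ok\n"
for hit in find_suspicious_chars(script):
    print(hit)
```

This flags every non-ASCII character, so it is only useful for files that are supposed to be plain ASCII, such as most shell scripts.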
One of my favorite programs was one where every single variable looked like 'K' but was different (there's K, Κ, K, 𐌊, 𝖪, К, Ꮶ, Ⲕ, etc.), and all of the strings were encoded in Roman numerals, e.g. ⅲⅰⅶⅺⅲⅶⅳⅰⅰⅳ.
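To see why those all count as different names, here is a small Python sketch that prints the code point and Unicode name for a few K lookalikes. The four characters are picked for illustration and written as escapes so there is no doubt which ones are meant; the program described above may have used others.

```python
import unicodedata

# Four characters that render much like a capital K (a subset chosen for illustration).
LOOKALIKES = ["\u004b", "\u039a", "\u041a", "\u212a"]

def describe(ch):
    """Return (code point, Unicode name, NFKC-normalized form) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.normalize("NFKC", ch))

for ch in LOOKALIKES:
    print(describe(ch))
```

One caveat: Python itself applies NFKC normalization to identifiers, so the Kelvin sign (U+212A) would collapse into a plain Latin K there, while the Greek and Cyrillic letters would remain distinct variables. Whether the trick works as described depends on the language.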
I agree wholeheartedly. I found out about the Underhanded C Contest through an HN thread and enjoyed reading through the submissions every year. Sadly it seems to have not been run since 2015.
On the topic of printf("%n"), I remember hearing a fun story about that a while ago. It appears that in certain Mazda cars it was impossible to play the "99% invisible" podcast on the car stereo (and only that podcast), because attempting to do so would crash the software.
Oh, I loaded up my music on a USB stick for my 2018 Mazda. I ended up having to strip most of the vorbis comments or really strange things would happen. I'd see things like contents of other comments appear inside the title or artist field; the implementation must be buggy as all hell. I wrote a program that strips it down to just album, artist, and replaygain tags, and that seems to not cause too many problems.
My latest PL idea, inspired by Rust's "unsafe {}" blocks is "expressive {}" blocks.
The "base language" would be boring, pedestrian, optimizing for zero surprises. But inside an expressive{} block, you could have macros, operator overloading, DSLs, first-class continuations, you name it.
I had a similar idea upon seeing the power of incredibly terse languages like k and q (both still used in a lot of investment banks). The core language would be boring and imperative, for plumbing, but you'd also be able to write and evaluate array language expressions as a first class construct.
The former doesn't follow from the latter. For tools I use and places I want to work:
* Programming languages should be general.
* Programmers should be competent and disciplined enough not to misuse that generality.
I absolutely hate programming in languages like Java, designed for idiot programmers. I understand their place -- there are a lot of incompetent programmers, and we need to constrain them, and a lot of places with boring IT problems who won't be able to hire competent people -- but it's not something I'd ever want to touch. That sort of workplace and that sort of language would make me miserable.
To quote a man I consider to be both very intelligent and wise:
> Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
I don't consider that man particularly wise. Clever has many different axes.
Compare debugging something like git or hg (clever, simple data structure) to cvs or svn (non-clever, standard data structure).
Systems like git and hg don't just do more; they're also far more debuggable thanks to clever. That's not an IOCCC type of cleverness, but a deep, deep clever.
Likewise, consider general-purpose data stores (such as a KVS or an RDBMS) compared to one-off data structures mapping directly to your data. The design of the original RDBMS was hyper-clever.
A wise man knows when to be clever and when not to be clever. I don't know of any programming language designer who can make that determination for me; it's too domain-specific. If you can rely on programmers to not be clever for inane reasons, you can allow them to be clever for architecturally-critical reasons.
I do a lot of my work in Python these days. One of the really nice things Python did is take the major design patterns in Lisp/Scheme, which were super-powerful but quite unreadable, and gave visually-distinct ways of expressing those.
For example, major design patterns like maps and filters are done as list comprehensions, which are nice.
Ditto for decorators.
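A quick sketch of both points, with made-up names (numbers, call_log, logged, add) purely for illustration: the same map-and-filter pipeline written Lisp-style and as a list comprehension, followed by a decorator, which is the visually distinct spelling of "wrap this function in a higher-order function".

```python
import functools

numbers = [1, 2, 3, 4, 5, 6]

# Lisp-style: compose map and filter with lambdas.
squares_fp = list(map(lambda n: n * n, filter(lambda n: n % 2 == 0, numbers)))

# The same pattern as a list comprehension.
squares_lc = [n * n for n in numbers if n % 2 == 0]

call_log = []  # hypothetical log, only for this illustration

def logged(func):
    """Wrap func so that each call is recorded in call_log."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        call_log.append((func.__name__, args, result))
        return result
    return wrapper

@logged
def add(a, b):
    return a + b

# Without the decorator syntax this would be written: add = logged(add)
```

Both forms compute the same list; the comprehension and the @ line just make the pattern recognizable at a glance.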
The key thing is that I still have general-purpose functional programming (except for tail recursion, which I dearly miss), but I use constructs which don't have explicit Python semantics quite rarely. Still, there are times when I can do something which will e.g. cut architectural complexity in half, and that's well worth a little bit of magic code. I'll usually try to do that in an isolated, well-documented file which has the magic. That's enabled by having a language which relies on programmers having discipline rather than handcuffs.
It's a lot cleaner than the architectural contortions I see for any use of Java beyond its target (Java has a nearly ideal set of expressiveness for making an inventory management system, CRM, store web site, or similar types of database-backed applications).
As a footnote, the place for the type of C code you gave is low-level programming. That's a disappearing market, as even microwaves can now afford RISC chips in the tens of megahertz with modern memory management, but for hardware, you really care not just what happens, but how it happens. I know what machine code while(*a++) will translate to, and for those sorts of systems, that's nice.
describing it as a way of circumventing security measures that try to ensure that a particular control flow is followed in a compiled binary. In that case the security goal could be seen as forcing the compiled program to behave in a way close to what a human writer (or reader) would expect, while this method gets around that. So in that way, it could still be a problem, because the programmer appeared to write specialized program X but it can potentially be induced to behave like unrelated program Y at run time.
Looks like the pretty formatting is popular. All the people that complain of Python layout should be jealous of this one: https://www.ioccc.org/2020/yang/prog.c
Speaking as a judge of the 27th competition, formatting doesn't give an entry as much of an advantage as you might think - what it does beyond the formatting has more influence. Code "quality" is always evaluated after pre-processing and restructuring. Credit is definitely given for compactness, functionality, uniqueness and the handling (exploitation) of boundary conditions.
Quite often there is something that "has not been seen before" - those entries do have a greater chance of being picked. The Ig-Nobels are seen as anti-Nobels. It is somewhat ironic that one of the few programming accolades you can be awarded is for writing code that will win an IOCCC award.
Of Python: I am quite biased against the language - there are limited ways to speak or communicate it to a blind or deaf person. Python relies on the physical layout and structure to be semantically correct. (Python correctness does not survive whitespace or silence removal - which requires both working eyes and ears)
I've worked with people who are visually impaired many times, and even programmers who are visually impaired more than once, and yet I keep getting floored with things I had never considered as a sighted person, like the fact that silences are not equivalent in Python...
If you are visually impaired you may have to use a screen reader. If that screen reader cannot correctly 'say' the whitespace, it may be difficult to understand languages such as Python where indentation is significant.
How would a blind person code? They need a way to convert text on-screen to sound (or haptic feedback). Such a program can be easily used for C, because all the structure of C code is explicitly marked by its syntax - opening and closing braces, semicolons, macro beginnings and endings, etc. A C code reader can skip over any amount of whitespace, because none of it is semantic.
A Python screen reader would not have the same power. It would somehow have to communicate the significant whitespace to the user through sound. You cannot "remove the silence" when listening to a Python program, in the same way that you cannot strip out whitespace without changing the behaviour of the program.
In theory it should be possible to read out loud an "indent" and a "dedent" token whenever the indentation changes. That would be basically the same trick that Python's parser uses under the hood.
However, I don't know if there are any screen readers that have been taught to do that.
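The indent/dedent idea can be tried with Python's standard tokenize module, which emits INDENT and DEDENT tokens itself. This is only a sketch of what a reader could say aloud, not a claim about how any real screen reader works; the function name and the spoken words are invented for the example.

```python
import io
import tokenize

def spoken_tokens(source):
    """Turn Python source into a flat list of words, announcing block structure explicitly."""
    words = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.INDENT:
            words.append("indent")
        elif tok.type == tokenize.DEDENT:
            words.append("dedent")
        elif tok.type == tokenize.NEWLINE:
            words.append("newline")
        elif tok.type in (tokenize.NAME, tokenize.NUMBER, tokenize.STRING, tokenize.OP):
            words.append(tok.string)
        # Comments, blank lines, and the end marker are skipped in this sketch.
    return words

source = "if x:\n    y = 1\nz = 2\n"
print(" ".join(spoken_tokens(source)))
```

With this, the amount of whitespace no longer matters to the listener, only the points where the nesting level changes, which is roughly what braces give you in C.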
[0] https://en.wikipedia.org/wiki/Underhanded_C_Contest
[1] https://blog.soliditylang.org/2020/12/03/solidity-underhande...
[2] https://github.com/ethereum/solidity-underhanded-contest/blo...