quuxplusone · 2025-07-31T04:37:08 1753936628

I suspect you can edit it back right now, just like you can edit the title back if HN changes it. The automatic stuff runs only once on initial submit (AFAIK).

akkartik · 2025-07-31T04:40:25 1753936825

Good to know. I can't edit anymore. Not sure if I could edit when I first responded. It was 4 hours later.

quuxplusone · 2025-07-29T18:54:58 1753815298

"500 code samples generated by Magistral-24B" — So you didn't use real code?

The paper is totally mum on how "descriptive" names (e.g. process_user_input) differ from "snake_case" names (e.g. process_user_input).

The actual question here is not about the model but merely about the tokenizer: is it the case that e.g. process_user_input encodes into 5 tokens, ProcessUserInput into 3, and calcpay into 1? If you don't break down the problem into simple objective questions like this, you'll never produce anything worth reading.
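A minimal sketch of how that question could be checked. The names `count_tokens` and `toy_encode` are hypothetical, and `toy_encode` is only a stand-in that splits on underscores and case boundaries; it is not a real model tokenizer, and a real BPE vocabulary may well split these names differently.

```python
import re


def toy_encode(name):
    """Stand-in encoder (an assumption, NOT a real tokenizer).

    Splits an identifier into lowercase or capitalized words, runs of
    capitals, runs of digits, and single underscores.
    """
    return re.findall(r"[A-Z]?[a-z]+|[A-Z]+|\d+|_", name)


def count_tokens(names, encode):
    """Map each identifier to the number of tokens `encode` yields for it."""
    return {name: len(encode(name)) for name in names}


names = ["process_user_input", "ProcessUserInput", "calcpay"]
counts = count_tokens(names, toy_encode)
for name, n in counts.items():
    print(f"{name}: {n}")
# process_user_input: 5
# ProcessUserInput: 3
# calcpay: 1
```

The stand-in just reproduces the naive word-boundary guess. To measure a real tokenizer, pass its encode function instead; for example, if the `tiktoken` package is installed, `tiktoken.get_encoding("cl100k_base").encode` should work as the `encode` argument.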

ijk · 2025-07-29T19:09:09 1753816149

True - though in the actual case of your examples, calcpay, process_user_input, and ProcessUserInput all encode into exactly 3 tokens with GPT-4.

Which is the exact kind of information that you want to know.

In practice, I'd expect the performance difference to be relatively minimal, as input tokens tend to quickly get aggregated into more general concepts. But that's the kind of question that's worth getting metrics on: my intuition suggests one answer, but do the numbers hold up when you actually measure it?

quuxplusone · 2025-07-29T23:47:33 1753832853

Awesome! You should have written this blog post instead of that guy. :)

quuxplusone · 2025-07-27T20:00:58 1753646458

The same outfit, "Ohio Gambling Recovery LLC", was covered by Matt Levine in June: https://archive.is/4eSKP

quuxplusone · 2025-07-27T04:12:29 1753589549

Original title: "How I Review GitHub PRs." The autoshortener strikes again! Remember to review your post after submitting, to undo the autoshortener's changes.

quuxplusone · 2025-07-19T20:52:11 1752958331

Python is my second language (after C++) and for me the surprising thing here is not "chained comparisons are weird" but rather "`in` is a comparison operator."

So for example `1 in [1,2] in [[1,2],[3,4]]` is True... but off the top of my head I don't see any legitimate use-case for that facility. Intuitively, I'd think a "comparison operator" should be homogeneous — should take the same type on LHS and RHS. So, like, "is-subset-of" could sanely be a comparison operator, but `in` can't be.
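A small sketch of why that expression is True: Python expands a chained comparison `a OP b OP c` into `(a OP b) and (b OP c)`, and `in` takes part in that chaining. The variable names below are only for illustration.

```python
a, b, c = 1, [1, 2], [[1, 2], [3, 4]]

# The chain is evaluated as two membership tests joined by `and`.
chained = a in b in c
expanded = (a in b) and (b in c)

# Explicit grouping means something else entirely.
left_grouped = (a in b) in c  # asks whether True is an element of c

try:
    a in (b in c)  # asks whether 1 is "in" a bool
    right_grouped_error = None
except TypeError as exc:
    right_grouped_error = type(exc).__name__

# A homogeneous relation such as is-subset-of chains more naturally.
subset_chain = {1} <= {1, 2} <= {1, 2, 3}

print(chained, expanded, left_grouped, right_grouped_error, subset_chain)
# True True False TypeError True
```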

zahlman · 2025-07-19T21:22:54 1752960174

>and for me the surprising thing here is not "chained comparisons are weird" but rather "`in` is a comparison operator."

Python documentation calls it a "comparison operator" (https://docs.python.org/3/reference/expressions.html#compari...) but a broader, better term is "relational operator". It expresses a possible relationship between two arguments, and evaluates whether those arguments have that relationship. In this case, "containment" (which intuitively encompasses set membership and substrings).

A finer distinction could have been made, arguably, but it's far too late to change now. (And everyone is traumatized by the 2->3 transition.)

quuxplusone · 2025-07-19T23:28:01 1752967681

Ah, `in` for strings satisfies my intuition re homogeneity. So I guess that makes sense enough (although if I ran the zoo I wouldn't have done it!). Thanks!

quuxplusone · 2025-07-13T13:27:45 1752413265

The jumping events do have different "strokes": long jump, (standing) broad jump, triple jump, possibly-etc. As far as I know, there is no generalized "transport yourself X distance without touching the ground" event. (Although I could be wrong.)

quuxplusone · 2025-07-12T14:54:16 1752332056

Very cool! Reminds me of Tim Burton's "Planet of the Apes" (2001), which did quadrupedal running with practical effects — harnesses, towed treadmills, all sorts of tricks — i.e., cheating, from the POV of this thread. :)

"Behind the Scenes of Tim Burton's Planet of the Apes": https://www.youtube.com/watch?v=KighzjHkZtY&t=803s "Ape School" starts at 9m35s. Quadrupedal running starts at 13m23s.

quuxplusone · 2025-07-12T14:43:58 1752331438

Ryuta Kinugasa, Yoshiyuki Usami. "How Fast Can a Human Run? Bipedal vs. Quadrupedal Running." Frontiers in Bioengineering and Biotechnology 4:56 (June 2016).

That looks remarkably like an April Fool's article released at the wrong time of year. The second-to-last paragraph is where they reveal the joke to anyone who wasn't already in on it:

> This study has limitations. Although statistical models are significantly related to mathematical formula [sic], the use of a statistical model to accurately predict future athletic performance is challenging (Hilbe, 2008). Fitted linear models should be treated with some caution. The use of linear regression for world record modeling would yield a continued decline that would eventually become negative, thus suggesting that update of world records can be continued until 0 s. It must also be noted that quadrupedal world records did not exist before 2008. This relatively recent involvement [sic] of quadrupedal running results in a somewhat tenuous comparison of world record times. Therefore, despite a high coefficient of determination, a large diverging confidence interval was found.—

—and then right back into it—

> —The 95% confidence intervals [sic] indicates that projected intersects could occur as early as in 2032 (9.238 s) or as late as 2076 (9.341 s).

A "rebuttal paper" might accept their major premise (i.e. feasibility of "a statistical model to accurately predict future athletic performance") but argue that rather than fitting a straight line (linear regression), we should fit an exponential decay curve (exponential regression). In an appendix, we'd try fitting a hyperbola (y = K1/(x-X0) + K2), taking X0 for quadrupedal running at 2008 and X0 for bipedal running anywhere from 2 million to 10 million years ago.

In an alternative "experimentalist approach," the rebuttal paper's author would actually run 100m himself, first on two legs and then on four; plot these as an additional data point (with x=2025) in each set; and fit a polynomial to that data. This would likely change the conclusion quite drastically. ;)

quuxplusone · 2025-07-12T12:27:40 1752323260

FWIW, I saw that the title was false (after all, Jank and C++ are two different things), but I assumed it was playing on the snowclone "Are we _X_ yet?" and therefore the blog post was going to be explaining why the answer to "Is Jank C++ yet?" should be "Yes, Jank is C++ now."

quuxplusone · 2025-07-10T17:23:42 1752168222

(Blog author here.) Nice find! I'll try to incorporate that into the post at some point. The 1951 first edition is also on archive.org (borrowable with a free login account): https://archive.org/details/preparationofpro0000maur/page/32...

I agree, it looks like this 1951 source is using "call in" to mean "invoke" — the actual transfer of control — as opposed to "load" or "link in." Which means this 1951 source agrees with Sarbacher (1959), and is causing me right now to second-guess my interpretation of the MANIAC II (1956) and Fortran II (1958) sources — could it be that they were also using "call in" to mean "invoke" rather than "indicate a dependency on"? Did I exoticize the past too much, by assuming the preposition in "call in" must be doing some work, meaning-wise?