
Thanks ChatGPT

> If you want to base your "ideas" of taxes (Do you own real estate?) on edge cases why not worry about eminent domain or property seizures without a warrant or charges being filed?

Particularly in the case of the latter example I would be pretty surprised to encounter someone in favor of both LVT and civil asset forfeiture. Are you sure this is a case of specific people having inconsistent policy preferences and not a case of a broad group containing people who hold incompatible views?


Mm, doughnuts. I'll take the flip side of that bet, since I don't think capturing the typing cadence for individual words would be all that helpful. I'd bet the typing cadences here are distinguishable from the cadence of normal English text (as might be collected by a malicious browser extension which vacuums up keystroke data on popular UGC sites).


It is striking how similar these answers are to each other, hitting the same points beat for beat in a slightly different tone.


Google employees collectively have a lot of talent.


A truly astonishing amount of talent applied to… hosting emails very well, and losing the search battle against SEO spammers.


Well, Search had no chance when the sites it ranks also make money from Google ads. Google fucked their own Search by giving themselves an incentive to favor bounce rate.


There are some patterns you can use that help a bit with this problem. The lowest-hanging fruit is to tell the LLM that its tests should test only through public interfaces where possible. Next after that is to add a step to the workflow: "check whether the not-yet-committed tests use any non-public interfaces in places where a public interface exposes the same functionality - if so, rewrite those tests to use only publicly exposed interfaces." You could likely also add linter rules, though sometimes you genuinely need to test something, like error conditions, that can't reasonably be tested only through public interfaces.
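For illustration, here's a minimal sketch of the kind of automated check I mean, assuming a Python codebase that follows the leading-underscore convention for non-public names. The `tests/` layout, file glob, and `find_private_access` helper are made up for the example - this isn't an existing linter plugin:

    # Minimal sketch: flag test files that reach into non-public members.
    # Assumes the convention that non-public names start with a single "_";
    # the tests/ layout and file glob are illustrative, not a real tool.
    import ast
    import pathlib

    def find_private_access(test_dir: str = "tests") -> list[str]:
        findings = []
        for path in pathlib.Path(test_dir).rglob("test_*.py"):
            tree = ast.parse(path.read_text())
            for node in ast.walk(tree):
                # obj._helper -> attribute access on a non-public member
                if (isinstance(node, ast.Attribute)
                        and node.attr.startswith("_")
                        and not node.attr.startswith("__")):
                    findings.append(f"{path}:{node.lineno} touches {node.attr!r}")
                # from mymodule import _helper -> importing a non-public name
                elif isinstance(node, ast.ImportFrom):
                    for alias in node.names:
                        if alias.name.startswith("_"):
                            findings.append(f"{path}:{node.lineno} imports {alias.name!r}")
        return findings

    if __name__ == "__main__":
        for finding in find_private_access():
            print(finding)

You'd still want a human (or a follow-up prompt) to decide whether each hit is one of the legitimate can't-test-it-any-other-way cases.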


Oh don't get me wrong. I'm sure that an LLM can write a decent test that doesn't have the problems I described. The problem is that LLMs are making a preexisting problem much, MUCH worse.

That problem statement is:

- Not all tests add value

- Some tests can even create disvalue (e.g. slow to run, thus increasing CI bills for the business without actually testing anything important)

- Few developers understand what good automated testing looks like

- Developers are incentivized to write tests just to satisfy code coverage metrics

- Therefore writing tests is a chore and an afterthought

- So they reach for an LLM because it solves what they perceive as a problem

- The tests run and pass, and they are completely oblivious to the anti-patterns just introduced and the problems those will create over time

- The LLMs are generating hundreds, if not thousands, of these problems

So yeah, the problem is 100% the developers who don't understand how to evaluate the output of a tool that they are using.

But unlike functional code, these tests are - in many cases - arguably creating disvalue for the business. At least the functional code is a) more likely to be reviewed and code quality problems addressed and b) even if not, it's still providing features for the end user and thus adding some value.


Are we using the same LLMs? I absolutely see cases of "hallucination" behavior when I'm invoking an LLM (usually sonnet 4) in a loop of "1 generate code, 2 run linter, 3 run tests, 4 goto 1 if 2 or 3 failed".

Usually, such a loop just works. In the cases where it doesn't, often it's because the LLM decided that it would be convenient if some method existed, and therefore that method exists, and then the LLM tries to call that method and fails in the linting step, decides that it is the linter that is wrong, and changes the linter configuration (or fails in the test step, and updates the tests). If in this loop I automatically revert all test and linter config changes before running tests, the LLM will receive the test output and report that the tests passed, and end the loop if it has control (or get caught in a failure spiral if the scaffold automatically continues until tests pass).

It's not an extremely common failure mode, as it generally only happens when you give the LLM a problem where it's both automatically verifiable and too hard for that LLM. But it does happen, and I do think "hallucination" is an adequate term for the phenomenon (though perhaps "confabulation" would be better).
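For concreteness, a rough sketch of the kind of harness I'm describing, with the revert step included. The agent call is a stub (substitute whatever CLI or SDK you actually drive), and the linter/test commands and protected paths are just examples:

    # Rough sketch of the generate -> lint -> test loop, with test files and
    # linter config reverted before each evaluation. The agent call is a
    # placeholder stub; ruff/pytest and the protected paths are examples.
    import subprocess

    PROTECTED_PATHS = ["tests/", "ruff.toml"]  # paths the model isn't allowed to change

    def ask_agent_for_patch(feedback: str) -> None:
        """Placeholder: invoke your coding agent here and let it edit the worktree."""
        raise NotImplementedError

    def run(cmd: list[str]) -> subprocess.CompletedProcess:
        return subprocess.run(cmd, capture_output=True, text=True)

    feedback = "initial task description"
    for _ in range(10):
        ask_agent_for_patch(feedback)                   # 1: generate code
        # Revert edits to tests/linter config so "relax the checks" can't be the fix.
        run(["git", "checkout", "--", *PROTECTED_PATHS])
        lint = run(["ruff", "check", "."])              # 2: run linter
        tests = run(["pytest", "-q"])                   # 3: run tests
        if lint.returncode == 0 and tests.returncode == 0:
            break                                       # both passed, stop
        feedback = lint.stdout + tests.stdout           # 4: goto 1 with the failures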

Aside:

> I can't imagine an agent being given permission to iterate Terraform

Localstack is great and I have absolutely given an LLM free rein over terraform config pointed at localstack. It has generally worked fine and written the same tf I would have written, but much faster.


I think "Who, specifically, claims that [...]?" comes off as less condescending than "Who claims that [...]? Be specific." just by virtue of the latter using imperative language, which triggers a reflexive "you're not the boss of me" reaction.


The message is clear in both cases. It's easier to put aside these irrational reflexive reactions and think about whatever worth can be derived from the message than it is to carefully manage the emotions of varied readers whom you don't know. This is different from being overtly inflammatory, although the lines for this are subjective.


Ultimately it's probably not a productive use of time to be commenting here at all from a strict EV perspective. Meaning that if you're posting here, you're probably getting something else out of it. The value of that "something else" determines how you should approach the problem of managing the gut reactions of your readers.

If someone asks for a better way to word something to reduce reader hostility to their point, I assume that they will be better off for knowing the answer to that question, and can decide for themselves whether they want to change their writing style or not - and, whether they do or do not, the effects of their writing will be more intentional.


In the two cases, the meaning of the message may be the same, but the tone of the message is different. One tone invites further engagement, the other invites disengagement.


> What does it do better than other languages?

Shared-nothing architecture. If you're using e.g. fastapi you can store some data in memory and that data will be available across requests, like so:

    import uvicorn, fastapi
    
    app = fastapi.FastAPI()
    
    counter = {"value": 0}
    
    @app.post("/counter/increment")
    async def increment_counter():
        counter["value"] += 1
        return {"counter": counter["value"]}
    
    @app.post("/counter/decrement")
    async def decrement_counter():
        counter["value"] -= 1
        return {"counter": counter["value"]}
    
    @app.get("/counter")
    async def get_counter():
        return {"counter": counter["value"]}
    
    if __name__ == "__main__":
        uvicorn.run(app, host="0.0.0.0", port=9237)

This is often the fastest way to solve your immediate problem, at the cost of making everything harder to reason about. PHP persists nothing between requests, so all data that needs to persist between requests must be explicitly persisted to some specific external data store.

Non-php toolchains, of course, offer the same upsides if you hold them right. PHP is harder to hold wrong in this particular way, though, and in my experience the upside of eliminating that class of bug is shockingly large compared to how rarely I naively would have expected to see it in codebases written by experienced devs.
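To make that concrete, here's a minimal sketch of what "holding it right" might look like in the same fastapi setting: the counter lives in an external store instead of process memory, so every worker sees the same value and nothing lingers in the process between requests. The Redis client and connection details are assumptions for the example - use whatever store you actually run:

    # Shared-nothing version of the counter above: no process-local state,
    # every request reads/writes an explicit external store.
    # Assumes a Redis server on localhost purely for illustration.
    import fastapi
    import redis

    app = fastapi.FastAPI()
    store = redis.Redis(host="localhost", port=6379, decode_responses=True)

    @app.post("/counter/increment")
    async def increment_counter():
        return {"counter": store.incr("counter")}

    @app.post("/counter/decrement")
    async def decrement_counter():
        return {"counter": store.decr("counter")}

    @app.get("/counter")
    async def get_counter():
        return {"counter": int(store.get("counter") or 0)}

    if __name__ == "__main__":
        import uvicorn
        uvicorn.run(app, host="0.0.0.0", port=9237)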


I hadn't really thought about PHP through this lens. But it's so much a part of where it came from as a preprocessor for text. It was a first-class part of the stateless design of the OG internet. Now everyone wants all things persisted all the time, which leads to crazy state problems.


Also because it's a language for the web, and HTTP is stateless.


But that's Python, no?

Edit: Oh, you showed an example against Python! Now I get it!


I'm not sure - a lot of the top comments are saying that this article is great and they learned a lot of new things. Which is great, as long as the things they learned are true things.

