I feel like submodules are one of Git's most misused features. They're intended as a method of pinning read-only upstream Git dependencies. And when used for that purpose, they're good at what they do.
I think that people mostly get a bad taste in their mouths because they try to use submodules for building multi-repo workspaces where a developer might need to commit in some/all of the repos. They're a bad fit for that problem, but it's mostly because that's not what they were designed to do.
I'd love to see the jj team tackle case #2, personally. I bet they'd do a pretty good job of it.
> They're intended as a method of pinning read-only upstream Git dependencies. And when used for that purpose, they're good at what they do.
No they are not. In theory they could be good, but the actual implementation falls down in ... let me count the ways:
1. Transitive dependencies. I sure do love that my company's submodule-based repo has 12 copies of one of our dependencies.
2. Checkouts don't work any more. You can't simply `git switch <branch>`, especially if that other branch has a different set of submodules. And good fucking luck if one branch has a submodule and another branch has a not-submodule in the same location.
3. They don't work with worktrees. In theory... maybe. In practice, the documentation says not to try and in my experience it is right!
4. The submodule URLs are baked into the git repo itself. This means you can no longer mirror the repo easily. I've had cases where I couldn't even clone a repo because the authors had used `ssh://` URLs, which required permissions I didn't have. It's insane that the authentication method gets baked into the repo; I have no idea why they implemented it like this.
5. The tooling experience is just way worse. Want to see a diff of everything you've changed? Well, you can't. If you've changed anything in a submodule you just get a hash difference, or at best a list of commits (which is better, but it's not even the default!); see the snippet below.
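For what it's worth, git can at least be coaxed into showing submodule changes as the commits they contain rather than a bare hash bump; a minimal sketch, run from the superproject:

    # Show submodule changes as a list of commits instead of
    # "Subproject commit <old>..<new>"
    git diff --submodule=log

    # Make that the default for this repository
    git config diff.submodule log

    # Make git status summarize submodule changes too
    git config status.submodulesummary 1

It doesn't fix the underlying model, but it makes day-to-day diffing slightly less opaque.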
Before you instinctively reach for the "well obviously it must work like that" part of your brain, take a moment to think if it should work like this. I can think of several ways to do it better (beyond just making the implementation less buggy).
I read that "read-only" as essentially a "lockfile".
Which is exactly how pinning dependencies works. But if you're mutating your dependencies frequently and want the reference to them to change at the same time, that's the big pain with submodules: you have to do both yourself. Not to mention the logistical problems that come with it, since this can't happen atomically in every scenario, let alone automatically.
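To make that two-step dance concrete, a manual bump looks roughly like this (the path and branch names are made up for illustration):

    # 1. Move the submodule's checked-out commit forward
    git -C third_party/libfoo fetch origin
    git -C third_party/libfoo checkout origin/main

    # 2. Record the new pinned commit in the superproject
    git add third_party/libfoo
    git commit -m "Bump libfoo"

`git submodule update --remote third_party/libfoo` can do step 1 for you, but step 2 is still on you, and nothing ties the two together atomically.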
That doesn't help if you actually need those repositories to exist separately.
For instance, consider the problem of having an external Open Source project your company maintains or heavily contributes to, which also has common libraries shared with internal non-public work. The same problem applies if you have a downstream Open Source project that needs to incorporate and vendor an upstream one while still contributing changes upstream easily.
Some folks do this by generating the external repo as a filtered version of an internal monorepo, but that's awful in so many ways, and breaks the ability to properly treat the external repo as a first-class thing that can merge PRs normally. It leads to workflows where people submitting PRs feel like their work gets absorbed internally and maybe eventually spit back out the other end, rather than just being merged.
Is the pain with publishing a subsetted version of the internal monorepo anything but a tooling limitation, where things like pushing into that subsetted version, and merging changes made on the subsetted version, aren't automatic because our tools don't natively understand subsets?
It would require forge integration, but I'd like a world where I could make a PR to `company/open-source-subdir` and the company could merge that PR and that was that without any extra steps because open-source-subdir is just a publicly published subset of the `company` repo.
No, it's not just a tooling limitation. Or, at least, not one solvable just by having forges expose public subsets of private repos. That might partially solve the simplest case of `company/open-source-subdir`, if the company trusts the forge tooling to handle subsetting and to not expose more than they want, but it doesn't solve the more general problem.
Consider the case where the repositories are owned by different entities, or have different governance. For instance, Project X needs to vendor Project Y, and carries a few downstream patches that they're working on upstreaming.
Right now, the two major options are:
- Use a submodule. Experience all the pain of submodules.
- Use some tooling to fold the repo into your own repo. Extract commits from it when you want to upstream. Experience all the pain of not having the original commits and the ability to easily pull/merge/cherry-pick/rebase, plus pain when you make changes across that repo and other code.
Does Claude Code use a different model than Claude.ai? Because Sonnet 4 and Opus 4 routinely get things wrong for me. Both of them have sent me on wild goose chases, where they confidently claimed "X is happening" about my code but were 100% wrong. They also hallucinated APIs, and just got a lot of details wrong in general.
The problem-space I was exploring was libusb and Python, and I used ChatGPT and also Claude.ai to help debug some issues and flesh out some skeleton code. Claude's output was almost universally wrong. ChatGPT got a few things wrong, but was in general a lot closer to the truth.
AI might be coming for our jobs eventually, but it won't be Claude.ai.
The reason that claude code is “good” is because it can run tests, compile the code, run a linter, etc. If you actually pay attention to what it’s doing, at least in my experience, it constantly fucks up, but can sort of correct itself by taking feedback from outside tools. Eventually it proclaims “Perfect!” (which annoys me to no end), and spits out code that at least looks like it satisfies what you asked for. Then if you just ignore the tests that mock all the useful behaviors out, the amateur hour mistakes in data access patterns, and the security vulnerabilities, it’s amazing!
> Eventually it proclaims “Perfect!” (which annoys me to no end),
This has done wonders for me:
# User interaction
- Avoid sycophancy.
- Do what has been asked; nothing more, nothing less.
- If you are asked to make a change, summarize the change but do not explain its benefits.
- Be concise in phrasing but not to the point of omission.
You're right, but you can actually improve it pretty dramatically with subagents. Once you get into a groove with them, it really makes a big difference.
It's also worth noting - the Bible isn't a single book, and it never was. It's 66 or 73 separate books (depending on the flavor of Christianity), bound together in a single binding. Much more of an anthology than a single book. The books were separated in time, authorship, culture, and even language. Never intended to be taken together as a single document.
Obviously not the intent of the original authors or the people who decided to compile these documents into an authoritative anthology.
Not convinced "all over the world" is a fair representation. Biblical literalists treat it as a single work, and they are mostly American or follow American leadership and tradition.
They also usually pick a particular version of "the Bible". Martin Luther's version, which was the Catholic version with some bits taken out. They also usually regard the Catholics who compiled that particular version as heretics. They also usually prefer a particular 17th century translation (so missing a lot of more recent scholarship and discoveries), and sometimes even a particular late 19th century (I think?) edition of that translation.
The preference for the KJV is quite amusing given it means social conservatives who presumably vote Republican are relying on the authority of a gay monarch.
Or "the earliest copier". Someone writes a story from an oral tradition down. It gets copied. Someone else copies the copy, and adapts the story to fit his own narrative.
The landing team's job is to "land" patches into upstream. They take Qualcomm code and spin straw into gold until it's eventually good enough to contribute to projects like the Linux kernel.
Having read a lot of Qualcomm code myself, I don't envy their job.
There's a certain irony in the idea that Hawaii has interstates, given that it's an archipelago. It's great that H1, H2, and H3 exist, and Hawaii deserves the same road funding as any other state. But there's surely some lesson to be had here about naming conventions, or emergent properties, or maybe something else.
The interstate highway system is officially the National System of Interstate and Defense Highways. So all the "interstates" in Hawaii are actually Defense Highways that connect Pearl Harbor with other military bases on Oahu.
- The H-1 goes from Barbers Point to Pearl Harbor to Diamond Head.
- The H-2 connects Pearl Harbor with Schofield Barracks.
- The H-3 connects Pearl Harbor with MCBH (Marine Corps Base Hawaii) at Kaneohe.
I presume in Alaska's case, at least, it's a funding technicality. As you said, the roads are not built to interstate standards (with a few exceptions around Fairbanks and Anchorage, maybe). For example, without prior knowledge, no one is going to guess that the AlCan is an interstate, as there aren't even any signs indicating such.
I'm not sure about that. For whatever reason, I've noticed that my brain has a hard time holding onto ideas from LLM-written documentation. Maybe because LLMs generate the mathematically lowest-energy thing that they can.
I'd take poor grammar and interesting ideas over clear grammar devoid of real content any day of the week.
Assuming that your ability to remember the content isn't a result of differences in the substance of the content, in my experience the stylistic issue can be addressed with thoughtful training/prompting and lots of Do/Don't examples.
It helps if your technical writers already adhere to a voice/tone guide, which can be pretty easily adapted/extended for automated documentation generation. If one doesn't exist, you'll definitely want to create that first. Some good examples:
That’s understandable - the rule of “garbage in, garbage out” certainly still applies. I find that many engineers are capable of gathering the right requirements and content, but struggle with the polish/finish that makes docs more consumable - where LLMs can shine.
When talking with recruiters, I usually politely decline to state the first number, saying something like "salary isn't the only reason why I'd take a job, and I always look at the whole package as well as the role. I'm sure your offer will be competitive."
On forms where I have to put a range, I put $1.00.
I just gave that a read. Good doc overall. There are a few items I disagree with:
- Cargo-culted use of -o pipefail. Pipefail has its uses, but it breaks one of the most common things people do in a pipeline: filtering the output with grep. Add it on a per-recipe basis instead (see the sketch after this list).
- Marking non-file targets as .PHONY. This is strictly correct, but it's usually not necessary. I think it adds unneeded verbosity to the Makefile, especially if you have a lot of targets. Better to add it on an as-needed basis IMO.
- Recipes with multiple output files. Use of dummyfiles/flagfiles used to be the standard if a pattern-rule wasn't the right fit. But as of GNU Make 4.3 (shipping in Ubuntu 22.04 LTS), there is native support for grouped targets. Check it out here: https://www.gnu.org/software/make/manual/html_node/Multiple-...
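For anyone who hasn't run into it, grouped targets use `&:` instead of `:`; a small sketch (the bison rule is just an illustrative stand-in):

    # GNU Make 4.3+: '&:' declares that one run of this recipe produces
    # *both* files, so parallel builds won't execute it twice.
    parser.c parser.h &: parser.y
    	bison -d -o parser.c parser.y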
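And on the pipefail point above, a rough sketch of the per-recipe approach rather than a global .SHELLFLAGS (build.sh and build.log are placeholder names):

    SHELL := /bin/bash

    # Opt in where a failure mid-pipeline should fail the recipe:
    archive-log:
    	set -o pipefail; ./build.sh 2>&1 | tee build.log | gzip > build.log.gz

    # Leave it off where grep matching nothing is an expected outcome, not an error:
    list-warnings:
    	./build.sh 2>&1 | grep -i warning | sort -u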