This morning I was using an LLM to develop some SQL queries against a database it had never seen before. I gave it a starting point, and outlined what I wanted to do. It proposed a solution, which was a bit wrong, mostly because I hadn't given it the full schema to work with. Small nudges and corrections, and we had something that worked. From there, I iterated and added more features to the outputs.
At many points the code would have an error; to deal with this, I'd just supply the error message, as-is, to the LLM, and it would propose a fix. Sometimes the fix worked, and sometimes I had to intervene to push it in the right direction. It's OK - the whole process took a couple of hours, and probably would have been a whole day if I were doing it on my own, since I usually only need to remember anything about SQL syntax once every year or three.
A key part of the workflow, imo, was that we were working in the medium of the actual code. If the code is broken, we get an error, and can iterate. Asking for opinions doesn't really help...
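To sketch roughly what that loop looks like (a made-up schema and query here, not the actual ones): the first attempt guesses a column name that isn't in the schema, the database complains, and pasting that complaint back verbatim is usually enough for the model to correct the join.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    """)

    # First attempt: the model guessed a column (o.customer) that doesn't exist,
    # because it never saw the full schema.
    try:
        conn.execute("""
            SELECT c.name, SUM(o.total)
            FROM orders o JOIN customers c ON o.customer = c.id
            GROUP BY c.name
        """)
    except sqlite3.OperationalError as e:
        print(e)  # "no such column: o.customer" - this goes straight back into the chat

    # Second attempt, after feeding the error message back: the join uses the real column.
    rows = conn.execute("""
        SELECT c.name, SUM(o.total)
        FROM orders o JOIN customers c ON o.customer_id = c.id
        GROUP BY c.name
    """).fetchall()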
I often wonder if people who report that LLMs are useless for code haven't cracked the fact that you need to have a conversation with it - expecting a perfect result from your first prompt is setting it up for failure; the real test is whether you can get to a working solution after iterating with it for a few rounds.
As someone who has finally found a way to increase productivity by adding some AI, my lesson has sort of been the opposite. If the initial response after you've provided the relevant context isn't obviously useful: give up. Maybe start over with slightly different context. A conversation after a bad result won't provide any signal you can do anything with, there is no understanding you can help improve.
It will happily spin forever responding in whatever tone is most directly relevant to your last message: provide an error and it will suggest you change something (it may even be correct every once in a while!), suggest a change and it'll tell you you're obviously right, suggest the opposite and you will be right again, ask if you've hit a dead end and yeah, here's why. You will not learn anything or get anywhere.
A conversation will only be useful if the response you got just needs tweaks. If you can't tell what it needs feel free to let it spin a few times, but expect to be disappointed. Use it for code you can fully test without much effort, actual test code often works well. Then a brief conversation will be useful.
Because once you get good at using LLMs, you can get to working code in 5 rounds of back-and-forth in way less time than it would have taken you to type out the whole thing yourself, even if you'd have gotten it exactly right the first time coding it by hand.
Most of the code in there is directly copied and pasted in from https://claude.ai or https://chatgpt.com - often using Claude Artifacts to try it out first.
Some changes are made in VS Code using GitHub Copilot
If you do a basic query to GPT-4o every ten seconds it uses a blistering... hundred watts or so. More for long inputs, less when you're not using it that rapidly.
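Back-of-the-envelope, that same assumed ~100 W figure works out to roughly:

    # Rough arithmetic on the ~100 W figure above (an assumption, not a measurement).
    watts = 100               # average draw attributed to this usage pattern
    seconds_per_query = 10    # one basic query every ten seconds
    joules_per_query = watts * seconds_per_query   # 1000 J
    wh_per_query = joules_per_query / 3600         # ~0.28 Wh per query
    print(f"{joules_per_query} J ~= {wh_per_query:.2f} Wh per query")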
I know. That's why I've consistently said that LLMs give me a 2-5x productivity boost on the portion of my job which involves typing code into a computer... which is only about 10% of what I do. (One recent example: https://simonwillison.net/2024/Sep/10/software-misadventures... )
(I get boosts from LLMs to a bunch of activities too, like researching and planning, but those are less obvious than the coding acceleration.)
> That's why I've consistently said that LLMs give me a 2-5x productivity boost on the portion of my job which involves typing code into a computer... which is only about 10% of what I do
This explains it then. You aren't a software developer
You get a productivity boost from LLMs when writing code because it's not something you actually do very much
That makes sense
I write code for probably 50-80% of any given week, which is pretty typical for any software dev I've ever worked with at any company I've ever worked at.
So we're not really the same. It's no wonder LLMs help you, you code so little that you're constantly rusty
I very much doubt you spend 80% of your working time actively typing code into a computer.
My other activities include:
- Researching code. This is a LOT of my time - reading my own code, reading other code, reading through documentation, searching for useful libraries to use, evaluating if those libraries are any good.
- Exploratory coding in things like Jupyter notebooks, Firefox developer tools etc. I guess you could call this "coding time", but I don't consider it part of that 10% I mentioned earlier.
- Talking to people about the code I'm about to write (or the code I've just written).
- Filing issues, or updating issues with comments.
- Writing documentation for my code.
- Straight up thinking about code. I do a lot of that while walking the dog.
- Staying up-to-date on what's new in my industry.
- Arguing with people about whether or not LLMs are useful on Hacker News.
You must not be learning very many new things then if you can't see a benefit to using an LLM. Sure, for the normal crud day-to-day type stuff, there is no need for an LLM. But when you are thrown into a new project, with new tools, new code, maybe a new language, new libraries, etc., then having an LLM is a huge benefit. In this situation, there is no way that you are going to be faster than an LLM.
Sure, it often spits out incomplete, non-ideal, or plain wrong answers, but that's where having SWE experience comes into play to recognize it.
> But when you are thrown into a new project, with new tools, new code, maybe a new language, new libraries, etc., then having an LLM is a huge benefit. In this situation, there is no way that you are going to be faster than an LLM.
In the middle of this thought, you changed the context from "learning new things" to "not being faster than an LLM"
It's easy to guess why. When you use the LLM you may become productive more quickly, but I don't think you can argue that you're really learning anything.
But yes, you're right. I don't learn new things from scratch very often, because I'm not changing contexts that frequently.
I want to be someone who has 10 years of experience in my domain, not 1 year of experience repeated 10 times, which means I can't be starting over with new frameworks, new languages and such over and over.
Exactly! I learn all kinds of things besides coding-related things, so I don't see how it's any different. ChatGPT 4o does an especially good job of walking through the generated code to explain what it is doing. And, you can always ask for further clarification. If a coder is generating code but not learning anything, they are either doing something very mundane or they are being lazy and just copy/pasting without any thought, which is also a little dangerous, honestly.
It really depends on what you're trying to achieve.
I was trying to prototype a system and created a one-pager describing the main features, objectives, and restrictions. This took me about 45 minutes.
Then I fed it into Claude and asked it to develop said system. It spent the next 15 minutes outputting file after file.
Then I ran "npm install" followed by "npm run" and got a "fully" (API was mocked) functional, mobile-friendly, and well documented system in just an hour of my time.
It'd have taken me an entire day of work to reach the same point.
Yeah nah. The endless loop of useless suggestions or "solutions" is very easily achievable and common, at least in my use cases, no matter how much you iterate with it. Iterating gets counter-productive pretty fast, imo. (Using 4o.)
When I use Claude to iterate/troubleshoot I do it in a project and in multiple chats. So if I test something and it throws an error or gives an unexpected result, I'll start a new chat to deal with that problem, correct the code, update that in the project, then go back to my main thread and say "I've updated this" and provide it the file, "now let's do this". When I started doing this it massively reduced the LLM getting lost or going off on weird quests. Iteration in side chats, regroup in the main thread. And then possibly another overarching "this is what I want to achieve" thread where I update it on the progress and ask what we should do next.
I have been thinking about this a lot recently. I have a colleague who simply can’t use LLMs for this reason - he expects them to work like a logical and precise machine, and finds interacting with them frustrating, weird and uncomfortable.
However, he has a very black and white approach to things and he also finds interacting with a lot of humans frustrating, weird and uncomfortable.
The more conversations I see about LLMs the more I’m beginning to feel that “LLM-whispering” is a soft skill that some people find very natural and can excel at, while others find it completely foreign, confusing and frustrating.
It really requires self-discipline to ignore the enthusiasm of the LLM as a signal for whether you are moving in the direction of a solution. I blame myself for lazy prompting, but I have a hard time not just jumping in with a quick project and hoping the LLM can get somewhere with it, rather than stopping to check that I'm not attempting something impossible, etc.
> OK - the whole process took a couple hours, and probably would have been a whole day if I were doing it on my own, since I usually only need to remember anything about SQL syntax once every year or three
If you have any reasonable understanding of SQL, I guarantee you could brush up on it and write it yourself in less than a couple of hours unless you're trying to do something very complex
Obviously to a mega super genius like yourself an LLM is useless. But perhaps you can consider that others may actually benefit from LLMs, even if you’re way too talented to ever see a benefit?
You might also consider that you may be over-indexing on your own capabilities rather than evaluating the LLM’s capabilities.
Let's say an LLM is only 25% as good as you but is 10% the cost. Surely you'd acknowledge there may be tasks that are better outsourced to the LLM than to you, strictly from an ROI perspective?
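To make that toy comparison concrete (the 25% and 10% are made-up numbers, as above):

    # Toy ROI comparison with the made-up numbers from the sentence above.
    human_quality, human_cost = 1.00, 1.00   # baseline: you
    llm_quality, llm_cost = 0.25, 0.10       # "25% as good, 10% the cost"

    human_value_per_cost = human_quality / human_cost   # 1.0
    llm_value_per_cost = llm_quality / llm_cost         # 2.5
    # 2.5x better value per dollar, for tasks where 25%-quality output is good enough.
    print(llm_value_per_cost / human_value_per_cost)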
It seems like your claim is that since you’re better than LLMs, LLMs are useless. But I think you need to consider the broader market for LLMs, even if you aren’t the target customer.
Knowing SQL isn't being a "mega super genius" or "way talented". SQL is flawed, but being hard to learn is not among its flaws. It's designed for untalented COBOL mainframe programmers on the theory that Codd's relational algebra and relational calculus would be too hard for them and prevent the adoption of relational databases.
However, whether SQL is "trivial to write by hand" very much depends on exactly what you are trying to do with it.
Sure, I could do that. But I would learn where to put my join statements relative to the where statements, and then forget it again in a month because I have lots of other things that I actually need to know on a daily basis. I can easily outsource the boilerplate to the LLM and get to a reasonable starting place for free.
Think of it as managing cognitive load. Wandering off to relearn SQL boilerplate is a distraction from my medium-term goal.
edit: I also believe I'm less likely to get a really dumb 'gotcha' if I start from the LLM rather than cobbling together knowledge from some random docs.
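The bit of boilerplate in question is roughly this clause ordering (a generic example, not the real query):

    # Generic reminder of SQL clause order: FROM, JOIN ... ON, WHERE, GROUP BY, ORDER BY.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE employees (id INTEGER PRIMARY KEY, dept_id INTEGER, salary REAL);
    """)

    rows = conn.execute("""
        SELECT d.name, AVG(e.salary) AS avg_salary
        FROM employees e
        JOIN departments d ON e.dept_id = d.id   -- joins go here, before...
        WHERE e.salary > 0                       -- ...the WHERE clause
        GROUP BY d.name
        ORDER BY avg_salary DESC
    """).fetchall()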
If you don’t take care to understand what the LLM outputs, how can you be confident that it works in the general case, edge cases and all? Most of the time that I spend as a software engineer is reasoning about the code and its logic to convince myself it will do the right thing in all states and for all inputs. That’s not something that can be offloaded to an LLM. In the SQL case, that means actually understanding the semantics and nuances of the specific SQL dialect.
That makes sense, and from what I've heard this sort of simple quick prototyping is where LLM coding works well. The problem in my case was that I was working with multiple large code bases, and couldn't pinpoint the problem to a specific line, or even file. So I wasn't gonna just copy multiple git repos into the chat.
(The details: I was working with running a Bayesian sampler across multiple compute nodes with MPI. There seemed to be a pathological interaction between the code and MPI where things looked like they were working, but never actually progressed.)
I wonder if it breaks like this: people who don't know how to code find LLMs very helpful and don't realize where they are wrong. People who do know immediately see all the things they get wrong and they just give up and say "I'll do it myself".