
Things have improved considerably over the last 3 months. Claude with cursor.ai certainly succeeds more than 50% of the time.


I haven't used cursor.ai, but Claude 3.5 Sonnet definitely has the issues I'm talking about. Maybe I'm not great at prompting, but prompting is far from an exact science. I always ask it specific things I need help with, make sure to provide sufficient detail, and don't ask it to produce mountains of code. I've had it generate code that not only hallucinates APIs but also has trivial bugs like referencing undefined variables. How this can scale beyond a few lines of code to produce an actually working application is beyond me. But apparently I'm in the minority here, since people are actually using these tools successfully for just that, so more power to them.
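For what it's worth, the undefined-variable class of bug is at least cheap to screen for before you run anything. A minimal sketch, assuming pyflakes is installed (pip install pyflakes); the generated_code string is a stand-in for real model output:

    # Statically screen LLM-generated code for trivial bugs such as
    # undefined names before executing it.
    import sys
    from pyflakes.api import check
    from pyflakes.reporter import Reporter

    # Stand-in for model output; 'itmes' is a deliberate undefined name.
    generated_code = """
    def total(items):
        return sum(item.price for item in itmes)
    """

    # check() parses the code and reports problems; it returns the
    # number of warnings found (undefined names are among them).
    num_warnings = check(generated_code, "generated.py",
                         Reporter(sys.stdout, sys.stderr))
    print(f"{num_warnings} potential problem(s) found")

It obviously won't catch hallucinated APIs (the call may be perfectly well-formed), but it filters out the dumbest failures for free.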


I think it really depends on the language. It generates rough but working Python code, yet for SQL it generates really weird, crummy code that often doesn't solve the problem.

I find it really helpful when I don't know a library very well but can assess whether the output works.

More generally, I think you need to give it pretty constrained problems if you're working on anything relatively complicated.


Where a library is new or not yet known to the LLM, I just go find the most similar examples in the docs and chuck them into the context window too (easy to do with aider). Then I say 'fix it'. It does an incredible job.
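The same trick works outside aider too. A minimal sketch using the Anthropic Python SDK; the model id, file paths, and prompt wording are my own assumptions, the point is just to prepend working doc examples so the model isn't guessing at an API it has never seen:

    # Prepend library doc examples to the prompt, then ask for a fix.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    # Hypothetical paths: a working example copied from the target
    # library's docs, and the broken script you want repaired.
    docs_examples = open("docs/quickstart_example.py").read()
    broken_script = open("my_script.py").read()

    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",  # assumption: any recent Claude works
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": (
                "Here are working examples from the library's docs:\n\n"
                f"{docs_examples}\n\n"
                "Using only the APIs shown above, fix my broken script:\n\n"
                f"{broken_script}"
            ),
        }],
    )
    print(response.content[0].text)

Grounding the model in real examples this way is basically few-shot prompting; it sharply cuts down hallucinated method names.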



