
There are a lot of inconsistencies like that.

- (2 web_search and 1 web_fetch)

- (3 web searches and 1 web fetch)

- (5 web_search calls + web_fetch)

which makes me wonder what's deliberate, what's empirical, and whether they just let each team add something and collect some stats after a month.



I’ve noticed that my own prompt-writing for code bases is basically just programming, but… without any kind of consistency checking, and with terrible refactoring tools. I find myself doing stuff like this all the time by accident.

This is one of many reasons I think the tech is best avoided unless absolutely necessary.


wdym by refactoring in this context?

& what do you feel is missing in consistency checking? wrt input vs output or something else?


> wdym by refactoring in this context?

The main trouble is when you find that a different term produces better output, but you've already used the original term a lot (potentially across multiple prompts) and don't want to hand-edit every instance; or you've used a repeated pattern with some variation and need to change it to a different pattern.

You can of course apply an LLM to these problems (what else are you going to do? Find-n-replace and regex are better than nothing, but not awesome), but there's always the risk of it mangling things in odd and hard-to-spot ways.
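
If you do go the find-n-replace route, a scripted rename at least gives you a diff to review before committing; a rough sketch (the directory layout and terms here are just illustrative):

    import pathlib
    import re

    # Invented example: rename one term across all prompt files, then
    # review the resulting diff before committing.
    OLD, NEW = "web search", "web_search call"

    for path in pathlib.Path("prompts").glob("**/*.md"):
        text = path.read_text()
        updated, count = re.subn(rf"\b{re.escape(OLD)}\b", NEW, text)
        if count:
            print(f"{path}: {count} replacement(s)")
            path.write_text(updated)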

Templating can help, sometimes, but you may have a lot of text before you spot places you could usefully add placeholders.
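
Even something as minimal as pulling the repeated terms into one table helps once you do spot them; a rough sketch (the terms and template text are made up for illustration):

    from string import Template

    # One place to define repeated terms; renaming a term everywhere
    # becomes a one-line change. These values are invented examples.
    TERMS = {"tool": "web_search call", "limit": "3"}

    PROMPT = Template(
        "Use at most $limit ${tool}s per question.\n"
        "Summarize each $tool result before answering."
    )

    print(PROMPT.substitute(TERMS))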

Writing prompts is just a weird form of programming, and has a lot of the same problems, but the natural-language medium hampers the use of traditional programming tools and techniques.

> & what do you feel is missing in consistency checking? wrt input vs output or something else?

Well, sort of: it does suck that the stuff's basically impossible to unit-test or to develop as units; all you can do is test entire prompts. But what I was thinking of was terminology consistency. Your editor won't red-underline if you use a synonym where you'd prefer the same term in all cases, the way it would if you used the wrong function name. It won't produce a type error if you've chosen a term or turn of phrase that's more ambiguous than some alternative. That kind of thing.
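
You could approximate the red-underline with a dumb lint that flags discouraged synonyms in prompt files; a rough sketch (the synonym table is invented, you'd maintain your own):

    import re
    import sys

    # Preferred term -> synonyms to flag. The table below is made up
    # for illustration; keep your own per project.
    PREFERRED = {
        "web search": ["search query", "lookup"],
        "web fetch": ["page fetch", "url fetch"],
    }

    def lint(text):
        warnings = []
        for keep, synonyms in PREFERRED.items():
            for syn in synonyms:
                for m in re.finditer(re.escape(syn), text, re.IGNORECASE):
                    warnings.append(f"offset {m.start()}: '{syn}' -> prefer '{keep}'")
        return warnings

    if __name__ == "__main__":
        for path in sys.argv[1:]:
            for w in lint(open(path, encoding="utf-8").read()):
                print(f"{path}: {w}")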


It feels like this prompt is a "stone soup" of different contributions, wildly varying in tone and formality.


...This also seems to me like the kind of thing that might happen if an AI was mostly regurgitating text but making small changes.

How confident are we that this system prompt is accurate?



