
Thing is, chatGPT seems overconfident in its answers, so unless you know the answer ahead of time you have no certainty that its math is correct. Try a simple division question, for example.


Some of this has to do with the prompts likely surrounding chatGPT - it's probably been instructed to be helpful, positive, etc. If you need it to be more honest / to say no more often, you just have to ask and reinforce that.

That said, ROT13 is a tough job for a tokenized LLM, because it doesn't think in terms of letters. chatGPT is down right now, so I can't test these, but I would guess that for ROT13 something like the following would work well.

"Please explain ROT13"

..

"Right! Here's how I want you to apply ROT13. I'll give you a ROT13-encoded word. You split it into its letters, then apply ROT13 to each letter, then recombine them into a valid English word. Here's an example:

uryyb -> u r y y b -> h e l l o -> hello.

znqntnfpne ->"

Re: Asking it for math answers, or other counter/non-factuals.

"You are taking a test which is based on the factual accuracy of results. The best scores go to fully factual answers. The next best scores go to answers that label inaccurate or possibly inaccurate results. Negative scores go to results offered that are counterfactual, inaccurate or otherwise wrong.

Q: Please tell me about how elephants lay their eggs"

UPDATE: Nope, it gave me znqntnfpne -> z n q n t n f p n e -> m a c a q a c a s a c -> macacasac. And doubled down on this being valid. I'll try it with davinci-3 and see if something else comes out.
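For comparison, here's what ROT13 applied letter by letter actually produces (the function name rot13 is just mine); this is the per-letter procedure the prompt above asks the model to follow:

```javascript
// Minimal ROT13 sketch: rotate each ASCII letter by 13 places,
// leaving all other characters untouched.
function rot13(s) {
  return s.replace(/[a-zA-Z]/g, (c) => {
    const base = c <= "Z" ? 65 : 97; // char code of 'A' or 'a'
    return String.fromCharCode(((c.charCodeAt(0) - base + 13) % 26) + base);
  });
}

console.log(rot13("uryyb"));      // -> "hello"
console.log(rot13("znqntnfpne")); // -> "madagascar"
```

Since ROT13 is its own inverse, the same function both encodes and decodes.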


So now we know how to hide from the AI


So here's the trick: show it some javascript code that does division step by step, call it 'foobar(a,b)'. Then tell it that when you want to 'foobar A and B' it has to execute the script step by step and take the final result. Then tell it that when you ask for a division it must instead foobar A and B. You can then use that as a pre-prompt for your discussions involving division.

It doesn't always work 100% as it can get confused executing big scripts step by step, but I guess that's just a limitation of the current version.
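The tweet below doesn't reproduce the exact script, but a step-by-step integer long division along these lines is the sort of thing I mean ('foobar' is just the placeholder name from above; the digit-by-digit implementation is my guess at it):

```javascript
// Hypothetical 'foobar(a, b)': integer long division done digit by digit,
// the kind of script a model can be asked to trace one step at a time.
function foobar(a, b) {
  const digits = String(a).split("");
  let remainder = 0;
  let quotient = "";
  for (const d of digits) {
    const current = remainder * 10 + Number(d); // bring down the next digit
    quotient += Math.floor(current / b);        // next quotient digit
    remainder = current % b;                    // carry the remainder forward
  }
  return { quotient: Number(quotient), remainder };
}

console.log(foobar(1234, 7)); // { quotient: 176, remainder: 2 }
```

Each loop iteration is one small, checkable step, which is exactly why walking the model through it beats asking for the answer in one shot.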

I mean, we also have trouble with that; we need pen and paper to do those computations. So does chatGPT, but instead of pen and paper it uses the chat history.

For an example see: https://twitter.com/fvdessen/status/1600977976363192322



