I was playing around with a similar kind of problem, trying to get it to decode Caesar-cipher-encoded text. I asked it to start by doing a frequency analysis of the ciphertext, and it was mostly right, but it counted an extra instance of one letter. From there I tried making it loop through different shift values, with finding a real word as the stop condition.
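For reference, the kind of frequency count I asked for looks roughly like this (the ciphertext here is a made-up sample, not the one I actually used):

```python
# Letter-frequency count of a ciphertext. The sample is hypothetical:
# "ATTACKATONCE" shifted by 4, not my actual input.
from collections import Counter

ciphertext = "EXXEGOEXSRGI"

counts = Counter(c for c in ciphertext if c.isalpha())
for letter, n in counts.most_common():
    print(letter, n)
```

This is the part the model nearly got right; it only miscounted one letter.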
It was able to shift by a constant number successfully, and it even tried shifting both forward (+2) and backward (-2) looking for valid words without additional prompting. But it did not loop through every possibility; it stopped after finding a word that wasn't real. The interesting thing was that when I asked the model in a follow-up question whether the word it found was real, it correctly identified that it had given an incorrect answer.
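The loop I was trying to get it to perform is simple to state: try all 26 shifts and stop at a real word. A rough sketch (the word list here is a stand-in for a proper dictionary check):

```python
import string

WORDS = {"ATTACK", "DEFEND", "RETREAT"}  # stand-in for a real dictionary

def shift(text: str, k: int) -> str:
    """Shift each uppercase letter by k positions, wrapping around."""
    return "".join(
        chr((ord(c) - ord("A") + k) % 26 + ord("A"))
        if c in string.ascii_uppercase else c
        for c in text
    )

def brute_force(ciphertext: str):
    # Unlike the model, exhaust every shift before giving up.
    for k in range(26):
        candidate = shift(ciphertext, -k)
        if candidate in WORDS:
            return k, candidate
    return None

print(brute_force("EXXEGO"))  # (4, 'ATTACK')
```

The model stopped partway through this search after convincing itself a non-word was real.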
Part of why it failed to find a word is that it made an incorrect step going from EXXEG... to TAAAT... in a poor attempt at applying the frequency analysis. It understood that E shouldn't substitute for E and moved on to E->T, but the actual substitution failed.
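For comparison, here is what the correct step would look like: a single guessed substitution like E->T fixes one shift for the whole alphabet, so EXXEG would have to become TMMTV, not TAAAT. A sketch:

```python
def apply_hypothesis(ciphertext: str, cipher_letter: str, plain_letter: str) -> str:
    # The shift implied by one guessed substitution, applied uniformly.
    k = (ord(plain_letter) - ord(cipher_letter)) % 26
    return "".join(chr((ord(c) - ord("A") + k) % 26 + ord("A")) for c in ciphertext)

print(apply_hypothesis("EXXEG", "E", "T"))  # 'TMMTV'
```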
The limitations of context memory and error checking are interesting and not something I expected from this model. The unprompted test of both positive and negative shift values shows some sort of System 2 thinking, but it doesn't seem consistent.
Somehow it reminds me of the the problems people have counting the number of letter t's in a sentence or not seeing when someone writes "the" twice in a row like I did earlier in this sentence.
I’ve found its ability to look up algorithms and explain them, or generate code, to be quite good. It generated a Flutter component for me to load an image asynchronously that was a great starting point.
It’s definitely a tool I’m willing to pay for to take the drudgery out of coding, and I can see it being incredibly useful when learning a new language or framework.
https://twitter.com/Knaikk/status/1600001061971849216