> In some ways, the most damning line in the article is in the conclusion:
> Success was not achieved with every large language model considered, and effort was required to engineer the prompt.
This, and the couple of paragraphs that follow, are interesting in that what is presented is the challenge of getting a language model to learn a simple parity calculation from examples. That's a Mensa-style riddle, and it would be very impressive if that kind of reasoning became a general capability purely from language inputs.
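For concreteness, the task looks something like the sketch below. This is my own minimal Python illustration; the function name and the example bit strings are not from the article, which is what makes it striking that prompt engineering was needed at all.

```python
# Illustrative sketch of the parity task: the label is 1 when the bit
# string contains an odd number of 1s, 0 otherwise. Names and examples
# are mine, not the article's.
def parity(bits: str) -> int:
    return bits.count("1") % 2

# The sort of few-shot pairs a prompt might present:
for example in ["1011", "0000", "1111", "1000001"]:
    print(example, "->", parity(example))
```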
It reminds me of the neural data profiler I worked on that was based on a CNN. We talked a lot about what sort of features would go into it, but the one thing we agreed on was that the CNN would struggle to learn the idea of a "well-formed credit card number", which has to comply with the Luhn checksum.
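For reference, and assuming the check in question is the standard Luhn checksum, it looks roughly like this. This is a textbook sketch, not code from the profiler:

```python
# Standard Luhn checksum, sketched from the textbook description; not
# code from the profiler discussed above.
def luhn_valid(number: str) -> bool:
    digits = [int(d) for d in number if d.isdigit()]
    total = 0
    # Walk right to left, doubling every second digit and folding
    # two-digit results back into a single digit (e.g. 16 -> 7).
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

print(luhn_valid("79927398713"))  # canonical Luhn test value -> True
```

Position-dependent doubling and a mod-10 sum don't show up as any local surface pattern in the digit string, which is the intuition behind expecting a convolutional model to struggle to learn it from examples.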