Quite an interesting point, and one I hadn't considered at all. On the one hand, I'm wondering what you'd suggest to address this with minimal changes to my criteria. On the other hand, if the analog of this is that AGI might get solved with more AGI, then that's only going to make me less worried in the first place!
I had a few issues with just using "current spam data, not trained on that particular data":
- It allows trainers to hard-code future rules based on their experience of what has gotten through past filters, even if their model isn't technically trained on this dataset.
- Similar emails might be sent to different mailboxes, and the copies that weren't in the training data would still be allowed (and I don't really want to go down the rabbit hole of defining a similarity metric between emails).
- I think I want to allow spammers to evolve their capabilities, at least using current techniques, which we all presumably agree are "less than AGI". After all, intelligence implies adapting to a dynamic environment. It's not really going to feel like AGI (and certainly not going to make me worry) if it turns out to be trivial for humans or less-than-AGI techniques to outsmart.