Wow, very interesting comment, thanks! Wouldn't it make sense to build (and maintain) a kind of "official" reference of all pure spam domains? Or does this list already exist?
Well, every crawler has to have this list; the Blekko crawler tries to keep these pages out of the index (with varying levels of success). But it's not particularly useful for non-crawlers, and since every crawler will have its own (possibly unique) way of evaluating hosts, it isn't really transportable.
That said, if you have ever wondered why domains that used to have web sites on them suddenly become huge spam havens, it is because spammers buy up the domain as soon as it expires and try to exploit its previous reputation as a non-spam site, to push link authority into some (generally Google's) crawl.