maybe some Bayesian learning algorithm splits the URL you submitted into tokens and calculates a spam probability based on those tokens (in which case the token 'youtube' might carry a high spam weight).
just a thought, though, which occurred to me after reading http://paulgraham.com/spam.html
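
To make the guess concrete, here is a rough sketch of how such a filter could score a URL. It is only an illustration and says nothing about what the site actually runs. The function names (`tokenize`, `train`, `token_prob`, `spam_probability`) and the tiny training lists are made up for the example. The 0.01/0.99 clamping, the 0.4 default for unseen tokens, and the way the per-token probabilities are combined follow the essay linked above, but this version is simplified: it does not double the non-spam counts and it uses every token rather than only the most "interesting" ones.

```python
import re
from collections import Counter


def tokenize(url):
    """Split a URL into lowercase alphanumeric tokens."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


def train(spam_urls, ham_urls):
    """Count, per token, how many spam and non-spam URLs contain it."""
    spam_counts = Counter(t for u in spam_urls for t in set(tokenize(u)))
    ham_counts = Counter(t for u in ham_urls for t in set(tokenize(u)))
    return spam_counts, ham_counts, len(spam_urls), len(ham_urls)


def token_prob(token, model):
    """Estimate the probability that a URL containing `token` is spam."""
    spam_counts, ham_counts, n_spam, n_ham = model
    s = spam_counts[token] / n_spam  # share of spam URLs with this token
    h = ham_counts[token] / n_ham    # share of non-spam URLs with this token
    if s + h == 0:
        return 0.4  # unseen token: the default picked in the essay
    return min(0.99, max(0.01, s / (s + h)))


def spam_probability(url, model):
    """Combine the per-token probabilities into one score for the URL."""
    p_spam = 1.0
    p_ham = 1.0
    for token in set(tokenize(url)):
        p = token_prob(token, model)
        p_spam *= p
        p_ham *= 1.0 - p
    return p_spam / (p_spam + p_ham)


# Hypothetical toy training data, invented purely for illustration.
spam_urls = [
    "http://youtube.com/watch?v=aaa",
    "http://youtube.com/watch?v=bbb",
    "http://cheap-pills.example/buy",
]
ham_urls = [
    "http://example.org/blog/post",
    "http://news.example.com/article",
    "http://youtube.com/watch?v=ccc",
]
model = train(spam_urls, ham_urls)

if __name__ == "__main__":
    print(token_prob("youtube", model))
    print(spam_probability("http://youtube.com/watch?v=zzz", model))
```

With this made-up data, 'youtube' shows up in more spam submissions than legitimate ones, so any new URL containing it starts out leaning towards spam, which would explain the behaviour described.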