
Mine does. And I wrote it myself.



You could use a Bloom filter to save on memory.

And perhaps empty it once a year, if necessary (i.e. so that replies to emails older than a year don't rank better than random emails).
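
A rough sketch of what I mean, assuming the filter is keyed on normalized subject lines of outgoing mail. The class name, the default sizes, and the `normalize_subject` helper are all made up for illustration, not taken from any real mail filter:

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: false positives are possible, false negatives are not."""

    def __init__(self, num_bits=1 << 20, num_hashes=7):
        # Sizes are illustrative assumptions; tune them to the expected item count.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive k bit positions from one SHA-256 digest via double hashing.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force the step to be odd
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )

    def clear(self):
        # The yearly reset: forget every subject seen so far.
        self.bits = bytearray(len(self.bits))


def normalize_subject(subject):
    """Lowercase, trim, and strip any leading 'Re:' prefixes (hypothetical helper)."""
    s = subject.strip().lower()
    while s.startswith("re:"):
        s = s[3:].strip()
    return s
```

You would call `sent.add(normalize_subject(subject))` for each outgoing message and test `normalize_subject(incoming_subject) in sent` on the way in. A plain Bloom filter cannot delete individual entries, which is why the periodic `clear()` is the simple way to age things out.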


I just throw all the subject lines into a database. The data volume is not even remotely close to the limits of a modern DB.
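
Roughly like this, as a minimal sketch using SQLite. The table name, column names, and function names here are invented for the example and are not my actual schema:

```python
import sqlite3


def normalize_subject(subject):
    """Lowercase, trim, and strip any leading 'Re:' prefixes (hypothetical helper)."""
    s = subject.strip().lower()
    while s.startswith("re:"):
        s = s[3:].strip()
    return s


def open_store(path=":memory:"):
    """Open (or create) the store; ':memory:' keeps the example self-contained."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sent_subjects ("
        " subject TEXT PRIMARY KEY,"
        " sent_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def record_sent(conn, subject):
    """Remember the subject of an outgoing message; duplicates are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO sent_subjects (subject) VALUES (?)",
        (normalize_subject(subject),),
    )
    conn.commit()


def is_reply_to_sent(conn, subject):
    """True if the incoming subject matches something that was sent earlier."""
    row = conn.execute(
        "SELECT 1 FROM sent_subjects WHERE subject = ?",
        (normalize_subject(subject),),
    ).fetchone()
    return row is not None
```

The primary key on `subject` gives an indexed exact-match lookup, and the `sent_at` column would make it easy to expire old rows if that ever mattered.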


Oh, I was thinking of how e.g. Gmail could do it.

It depends a bit on the volume of spam compared to legitimate emails. Gmail knows about all of your past outgoing messages anyway, but the Bloom filter would let it drop spam quickly.

Google is already using Bloom filters to, e.g., check locally in one service whether a cache in a different service is likely to be able to answer a query.



