Training Gmail: Sit. Stay. Good Gmail.

At the end of my presentation on Bayes’ Theorem at BarCampOrlando, there was some Q&A time.

I was asked a question about automatically training a spam filter, and I got into explaining how Bayesian filtering isn’t a “spam test” per se. The simplest way to think about Bayesian filtering is that you sort email you’ve already received into two piles: email you don’t consider spam, and email you do consider spam. Then, through the magic of Bayes, new emails automatically get put in one of the two piles, based on which pile the new email most resembles.
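To make the two-pile idea concrete, here is a minimal sketch of a naive Bayes sorter in Python. It is only an illustration under stated assumptions: the class name `TwoPileFilter`, the pile labels "spam" and "ham", the whitespace tokenizer, and the add-one smoothing are all my own choices for the example, not a description of how Gmail or any real filter works.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase words split on whitespace are good enough here.
    return text.lower().split()


class TwoPileFilter:
    """Hypothetical two-pile sorter: 'spam' and 'ham' (not spam)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, pile):
        # Put an already-received email on one of the two piles.
        self.word_counts[pile].update(tokenize(text))
        self.message_counts[pile] += 1

    def log_score(self, text, pile):
        # log P(pile) + sum of log P(word | pile), with add-one smoothing.
        # Assumes each pile has been trained with at least one message.
        counts = self.word_counts[pile]
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(counts.values())
        total_messages = sum(self.message_counts.values())
        score = math.log(self.message_counts[pile] / total_messages)
        for word in tokenize(text):
            # The extra +1 in the denominator leaves room for unseen words.
            score += math.log((counts[word] + 1) / (total_words + len(vocab) + 1))
        return score

    def classify(self, text):
        # The new email goes on whichever pile it most resembles.
        return max(("spam", "ham"), key=lambda pile: self.log_score(text, pile))


if __name__ == "__main__":
    f = TwoPileFilter()
    f.train("cheap viagra now", "spam")
    f.train("buy viagra online", "spam")
    f.train("lunch meeting tomorrow", "ham")
    f.train("project meeting notes", "ham")
    print(f.classify("viagra online cheap"))
    print(f.classify("meeting notes tomorrow"))
```

Notice that nothing in the code knows what spam "is". The labels are whatever you say they are, so if you train it with the piles swapped, it will happily sort the other way.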

Then I mentioned — as a bit of an oddity — that you could theoretically train Gmail to deliver nothing but Viagra spam to your Inbox. “Heh,” I thought, “that would be a neat trick.”


I’m trying to sign up for as many shady email newsletters and web forms as possible. I’m posting the email address here, as a fully-qualified mailto: link. Anything I can do to start getting spam as fast as possible. I’m planning on marking everything that mentions Viagra as “not spam”, even “1337-speak” emails like “V1agra”. Depending on how it goes, I hope to post results here.

(On a side note: I wonder how the IT dept at Pfizer handles spam. They must get a ton of false negatives for Viagra spam.)