Spam filtering datasets

From ACL Wiki

Latest revision as of 09:07, 19 November 2006

  • Enron-Spam: A collection of datasets containing spam messages and ham messages from the Enron corpus. See this article for further details.
  • Ling-Spam: A dataset containing spam messages and messages from the Linguist list. See this article for further details.
  • PU datasets: A collection of encrypted datasets containing spam messages and ham messages from real users. See this paper and this report for further details.