Spam filtering datasets

From ACL Wiki
  • Enron-Spam: A collection of datasets that contain spam messages, along with ham messages from the Enron corpus. See this article for further details.
  • Ling-Spam: A dataset that contains spam messages and messages from the Linguist list. See this article for further details.
  • PU datasets: A collection of encrypted datasets that contain spam messages and ham messages from real users. See this paper and this report for further details.