Schematizing a Global SPAM Indicative Probability

NIKOLAOS KORFIATIS *, MARIOS POULOS †, SOZON PAPAVLASSOPOULOS †

* Department of Management Science and Technology, Athens University of Economics and Business, Athens, Greece
† Department of Archive and Library Sciences, Ionian University, Corfu, Greece

Abstract

In this paper we propose a middleware infrastructure to address the problem of filtering unsolicited mail messages (known as SPAM). Our approach uses Bayesian classification of SPAM messages, built upon categorization models that map a probability to each word through text analysis of both unsolicited and legitimate mail messages, which makes it easier to draw a cumulative inference about the nature of an e-mail message. Our proposed architecture extends these models with collaborative filtering methods expressed via peer-to-peer networks, which will help to build more effective and accurate anti-spam filters.

Key-Words: e-mail, SPAM, Privacy, Peer-to-peer, Bayesian Classifiers

1 Introduction

SPAM [1], also known as mass commercial unsolicited email, is a fast-growing phenomenon at all levels of internet use. It affects everyone from end users to large enterprises such as Internet Service Providers (ISPs), and it is the most common type of email that a typical internet user receives every day. The socio-technical aspects of SPAM range from bandwidth costs to security and privacy matters. Furthermore, the development of sophisticated software crawlers, which make it easier for spammers to acquire the email addresses of people who have made them public via a website or by participating in an internet community such as USENET news, poses a threat to the use of e-mail as the primary means of computer-mediated communication.
SPAM protection currently follows two approaches. The first is the legal approach, now being applied in the US and the EU, which punishes senders responsible for large numbers of unwanted emails sent to internet users in violation of their privacy rights. The other side of the coin is the cost-sensitive application of techniques already developed in fields such as information retrieval and text categorization. Following this second direction, we adopt a collaborative filtering approach that uses the concept of node interconnections for information exchange, which is the main architecture of a peer-to-peer network. Collaborative filtering is the method of exchanging preferences and annotations regarding the same corpora of documents and information. Following the axiom that SPAM is not sent only to certain types of users, which makes it a global "phenomenon", we address the need for a collaborative filtering infrastructure that will make accurate recommendations about the intention of an e-mail message. In the next paragraphs we make a categorization of current SPAM