Volume 6 Number 1 (Jan. 2011)
JSW 2011 Vol.6(1): 116-123 ISSN: 1796-217X
doi: 10.4304/jsw.6.1.116-123

Application of Linear Classifier on Chinese Spam Filtering

Yongqin Qiu, Yan Xu, Dan Li

Beijing Language and Culture University, Beijing, China

Abstract—Spam is a key problem in electronic communication, especially in large-scale email systems. Content-based filtering is one mainstream method of combating this threat in its various forms: an e-mail filtering system can learn directly from a user’s mail set. Previous content-based filtering methods, however, have struggled to balance efficiency and effectiveness. Text categorization algorithms such as Naïve Bayes, kNN, Decision Tree and Boosting can be applied to spam filtering. However, the effectiveness of Naïve Bayes is limited and it is not well suited to instant feedback learning. Other algorithms such as SVM are more effective but computationally expensive. Because a real email system often has to handle a large volume of emails in a short time, efficiency is often as important as effectiveness when implementing an anti-spam filtering method. We therefore look to linear classifiers to solve this problem and explore two fast online linear classifiers for the task: the Perceptron and Winnow. Training for both methods is online and mistake-driven, and both are suitable for feedback learning. We evaluate the two methods on three benchmark corpora, PU1, Ling-Spam and 2005-Jun, and the experiments on these public e-mail corpora show effective results. We conclude that the two online linear classifiers achieve state-of-the-art performance for spam filtering, especially for Chinese spam emails.
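To make "online and mistake-driven" concrete, the following is a minimal sketch of the two textbook update rules, not the authors' implementation. It assumes each message has already been turned into a set of tokens (for Chinese mail this would require a word-segmentation step that the abstract does not describe). The class names, the toy data, and the parameter values (`learning_rate`, `threshold`, `promote`, `demote`) are illustrative assumptions. The Perceptron adjusts weights additively and Winnow adjusts them multiplicatively; both change weights only when a prediction is wrong.

```python
from typing import Dict, List, Set, Tuple

# (tokens in the message, label): +1 = spam, -1 = legitimate mail.
Example = Tuple[Set[str], int]


class Perceptron:
    """Additive, mistake-driven updates over binary token features."""

    def __init__(self, learning_rate: float = 1.0) -> None:
        self.weights: Dict[str, float] = {}
        self.bias = 0.0
        self.learning_rate = learning_rate  # illustrative value

    def predict(self, tokens: Set[str]) -> int:
        score = self.bias + sum(self.weights.get(t, 0.0) for t in tokens)
        return 1 if score > 0 else -1

    def update(self, tokens: Set[str], label: int) -> bool:
        """Learn from one message; return True if a mistake was corrected."""
        if self.predict(tokens) == label:
            return False  # correct prediction: weights stay untouched
        for t in tokens:
            self.weights[t] = self.weights.get(t, 0.0) + self.learning_rate * label
        self.bias += self.learning_rate * label
        return True


class Winnow:
    """Multiplicative, mistake-driven updates with positive weights."""

    def __init__(self, threshold: float = 2.0,
                 promote: float = 2.0, demote: float = 0.5) -> None:
        self.weights: Dict[str, float] = {}  # unseen tokens default to 1.0
        self.threshold = threshold  # illustrative values, not from the paper
        self.promote = promote
        self.demote = demote

    def predict(self, tokens: Set[str]) -> int:
        score = sum(self.weights.get(t, 1.0) for t in tokens)
        return 1 if score > self.threshold else -1

    def update(self, tokens: Set[str], label: int) -> bool:
        """Learn from one message; return True if a mistake was corrected."""
        if self.predict(tokens) == label:
            return False
        # Missed spam: promote active tokens. False alarm: demote them.
        factor = self.promote if label == 1 else self.demote
        for t in tokens:
            self.weights[t] = self.weights.get(t, 1.0) * factor
        return True


def train(model, examples: List[Example], epochs: int = 5) -> int:
    """Stream the examples through the model; return the number of mistakes."""
    mistakes = 0
    for _ in range(epochs):
        for tokens, label in examples:
            if model.update(tokens, label):
                mistakes += 1
    return mistakes


# Toy data, invented purely for illustration.
EXAMPLES: List[Example] = [
    ({"free", "money", "win"}, 1),
    ({"meeting", "schedule", "project"}, -1),
    ({"win", "prize", "free"}, 1),
    ({"project", "report", "lunch"}, -1),
]
```

Because `update` processes one message at a time, the same call can serve for user feedback: when a user reports a misclassified message, passing it to `update` with the corrected label adjusts only the weights of the tokens in that message.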

Index Terms—anti-spam, information filtering, Winnow, Perceptron, linear classifier.


Cite: Yongqin Qiu, Yan Xu, Dan Li, "Application of Linear Classifier on Chinese Spam Filtering," Journal of Software vol. 6, no. 1, pp. 116-123, 2011.
