
Please use this identifier to cite or link to this item: http://ntour.ntou.edu.tw:8080/ir/handle/987654321/27864

Title: Near-Duplicate mail detection based on URL information for spam filtering
Authors: Chun-Chao Yeh;Chia-Hui Lin
Contributors: NTOU:Department of Computer Science and Engineering
Date: 2006
Issue Date: 2011-10-21T02:34:16Z
Publisher: Lecture Notes in Computer Science (LNCS)
Abstract: Because spam techniques change rapidly to evade detection, we argue that multiple spam detection strategies should be developed to combat spam effectively. In the literature, many proposed spam detection schemes use similar strategies based on supervised classification techniques such as naive Bayesian, SVM, and k-NN, but only a few works use a strategy based on detecting duplicate copies. In this paper, we propose a new duplicate-mail detection scheme based on the similarity of mail context between incoming mails, especially the context of URL information. We discuss different design strategies to counter the tricks spammers may use to avoid detection. We also compare our approach with four approaches available in the literature: the octet-based histogram method, I-Match, Winnowing, and identical matching. Using thousands of real mails that we collected as test data, our experimental results show that the proposed strategy outperforms the others. Excluding compulsory misses, over 97% of near-duplicate mails are detected correctly.
Relation: 3691, pp.842-851
URI: http://ntour.ntou.edu.tw/handle/987654321/27864
Appears in Collections: [Department of Computer Science and Engineering] Journal Articles
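
The abstract describes comparing incoming mails by the URL information they contain. The record does not give the authors' algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: URLs are reduced to their host names, two mails are compared by the Jaccard similarity of their host sets, and the threshold of 0.5 is arbitrary. The names `url_hosts`, `jaccard`, and `is_near_duplicate` are hypothetical and are not taken from the paper.

```python
import re
from urllib.parse import urlsplit

# Assumption: only http/https links in the plain-text body are considered.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def url_hosts(body):
    """Return the set of lower-cased host names of all URLs found in a mail body."""
    hosts = set()
    for candidate in URL_RE.findall(body):
        try:
            host = urlsplit(candidate).hostname
        except ValueError:
            # Malformed URL (for example a broken IPv6 literal): skip it.
            continue
        if host:
            hosts.add(host.lower())
    return frozenset(hosts)


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def is_near_duplicate(mail_a, mail_b, threshold=0.5):
    """Treat two mails as near-duplicates if their URL host sets overlap enough.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return jaccard(url_hosts(mail_a), url_hosts(mail_b)) >= threshold


if __name__ == "__main__":
    first = "Buy now http://shop.example.com/a?id=1 cheap"
    second = "Best deal!! http://shop.example.com/b?id=999"
    print(is_near_duplicate(first, second))
```

Reducing each URL to its host is one simple way to stay robust when a sender varies the path or query string between copies. A mail without any URL yields an empty set and never matches under this sketch, so it would have to be handled by another detection strategy.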


All items in NTOUR are protected by copyright, with all rights reserved.

