"Clustering based ensemble classification for spam filtering",
in: "Proceedings of the 6th Workshop on Data Analysis", Elfa Academic Press, 2006, pp. 11-22.
Abstract. Spam filtering has become a very important issue in recent years, as unsolicited bulk e-mail causes considerable problems in terms of both the time spent on those messages and the resources needed to filter them automatically. Text information retrieval offers the tools and algorithms to handle text documents in their abstract vector form, to which machine learning algorithms can then be applied. This work deals with the possible improvements gained from ensembles, i.e. multiple, differing classifiers for the same task. Individual classifiers can fit parts of the training data better and may therefore improve classification results, provided the best-fitting classifier can be found. Basic classification algorithms as well as clustering are introduced. Furthermore, the application of the ensemble idea is explained and experimental results are presented.