Creating a SOMLib Digital Library

Department of Software Technology
Vienna University of Technology

Lessons Learned in Text Document Classification

Text archives may be regarded as an almost optimal application arena for unsupervised neural networks. This is because many of the operations computers have to perform on text documents are classification tasks based on noisy patterns. As a natural result, an ever-increasing number of research reports concerned with this type of application has appeared in the literature. In this paper we argue in favor of paying more attention to the fact that text archives lend themselves naturally to a hierarchical structure. We take advantage of this fact by using a hierarchically organized network built up from independent self-organizing maps in order to enable the true establishment of a document taxonomy.

Comments: rauber@ifs.tuwien.ac.at