Department of Software Technology
Vienna University of Technology
Lessons Learned in Text Document Classification
Text archives may be regarded as an almost optimal application arena
for unsupervised neural networks.
This is because many of the operations computers have to perform on text
documents are classification tasks based on noisy patterns.
As a natural result, an ever-increasing number of research reports
concerned with this type of application have appeared in the literature.
In this paper we argue in favor of paying more attention to the fact that
text archives lend themselves naturally to a hierarchical structure.
We take advantage of this fact by using a hierarchically organized
network built up from independent self-organizing maps in order to
establish a genuine document taxonomy.
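
As an illustration of the idea only, the following minimal sketch trains a
top-level self-organizing map on all documents and then one independent
map per top-level unit on the documents assigned to that unit. The abstract
does not specify the architecture, so the two-layer layout, the map sizes,
the learning schedule, and all names (train_som, train_hierarchy, classify)
are assumptions made for this example, and the document vectors are random
stand-ins for real term-frequency vectors.

```python
import math
import random


def best_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (Euclidean)."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))


def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a small rows x cols map; returns a list of weight vectors."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Initialise each unit from a randomly chosen input vector.
    weights = [list(rng.choice(data)) for _ in grid]
    sigma0 = max(rows, cols) / 2.0
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for idx in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.01     # shrinking neighbourhood
            x = data[idx]
            wr, wc = grid[best_unit(weights, x)]
            for u, (r, c) in enumerate(grid):
                d2 = (r - wr) ** 2 + (c - wc) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                weights[u] = [w + lr * h * (xi - w)
                              for w, xi in zip(weights[u], x)]
            step += 1
    return weights


def train_hierarchy(data, top_shape=(2, 2), child_shape=(2, 2)):
    """Assumed two-layer hierarchy: one top map, one child map per used unit."""
    top = train_som(data, *top_shape)
    groups = {}
    for x in data:
        groups.setdefault(best_unit(top, x), []).append(x)
    # Each child map is trained independently on its own subset of documents.
    children = {u: train_som(docs, *child_shape, seed=u + 1)
                for u, docs in groups.items()}
    return top, children


def classify(top, children, x):
    """Return (top-level unit, child unit); child is None if no map exists."""
    unit = best_unit(top, x)
    child = children.get(unit)
    return unit, (best_unit(child, x) if child is not None else None)


# Toy data: 30 hypothetical documents described by 8 term weights each.
_rng = random.Random(42)
docs = [[_rng.random() for _ in range(8)] for _ in range(30)]
top, children = train_hierarchy(docs)
path = classify(top, children, docs[0])  # position of a document in the taxonomy
```

The pair returned by classify can be read as a path through a two-level
taxonomy: the first index selects a coarse topic area, the second a finer
group within it.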
Comments: rauber@ifs.tuwien.ac.at