The self-organizing map (SOM) [2,3] is one of the most prominent neural network models adhering to the unsupervised learning paradigm. The model consists of p units, each of which is assigned an n-dimensional weight vector $m_i \in \mathbb{R}^n$. Training is performed as a repetition of (i) input pattern presentation, (ii) selection of the unit with the closest weight vector, i.e. the winner, and (iii) adaptation of the weight vectors of the winner and of a number of units in the neighborhood of the winner, moving them closer to the presented input. It is important to guarantee that both the strength of adaptation and the number of units adapted in addition to the winner decrease over time. The distinctive feature of the self-organizing map is that similar input patterns are arranged in neighboring regions of the resulting map. Hence, the similarity between input data items, i.e. the documents in our application, is mirrored by the distance between the respective winners within the map.
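The three training steps can be illustrated with a minimal sketch in plain Python. It is not the implementation used here, only one possible instance under stated assumptions: a rectangular grid, squared Euclidean distance for winner selection, a hard-cutoff neighborhood, and linearly decaying learning rate and radius. The names `train_som`, `winner`, `lr0`, and the toy vectors in `docs` are hypothetical.

```python
import random

def winner(weights, x):
    """Grid position of the unit whose weight vector is closest to x."""
    return min(weights, key=lambda pos: sum((m - xk) ** 2
                                            for m, xk in zip(weights[pos], x)))

def train_som(data, rows, cols, steps=2000, lr0=0.5, seed=0):
    """Train a rows x cols map on a list of equal-length numeric vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    radius0 = max(rows, cols) / 2.0          # assumed initial neighborhood radius
    # One n-dimensional weight vector per unit, keyed by its grid position.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for t in range(steps):
        decay = 1.0 - t / steps              # assumed linear schedule
        lr = lr0 * decay                     # strength of adaptation decreases
        radius = radius0 * decay             # fewer neighbors are adapted over time
        x = rng.choice(data)                 # (i) present an input pattern
        win = winner(weights, x)             # (ii) select the winner
        for (r, c), m in weights.items():    # (iii) adapt winner and its neighbors
            if (r - win[0]) ** 2 + (c - win[1]) ** 2 <= radius ** 2:
                for k in range(dim):
                    m[k] += lr * (x[k] - m[k])
    return weights

# Hypothetical toy data: two groups of similar two-dimensional vectors.
docs = [[0.1, 0.1], [0.15, 0.05], [0.9, 0.9], [0.85, 0.95]]
som = train_som(docs, rows=3, cols=3)
for d in docs:
    print(d, "->", winner(som, d))
```

Printing the winner of each vector shows where it lands on the grid. Real document collections would use much higher-dimensional vectors and larger maps.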
If a digital library is distributed over several sites, it may be more efficient to maintain independent self-organizing maps that represent the various parts of the library than to transfer all of the information to one site for training. However, when some form of uniform access to the data is required, the contents of the various sites have to be integrated. In our approach to digital library organization we propose using self-organizing maps to perform this integration. In particular, the map that integrates different portions of the digital library may be trained on the weight vectors of the maps to be integrated. This strategy may be applied recursively to build hierarchies of arbitrary depth, as shown in Figure 1. In this figure, two maps are integrated into a third map. Note that selected parts of self-organizing maps may also be integrated using essentially the same architecture. The user simply selects areas of interest scattered across different maps for which the integration is to be performed. In this way, the user may tie together pieces of information to build her own library, fine-tuned to her particular interests.
The effect of such an integration is that input data items that are separated in different low-level maps are grouped together in the high-level map. Because each low-level unit enters the high-level training through its single weight vector, input data that are mapped onto the same low-level unit are also represented together in the high-level map.
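The integration of several maps into a higher-level map can be sketched as follows. This is an illustration under assumptions, not the system described here: a tiny one-dimensional map (`train_som_1d`) stands in for any SOM trainer, the site data are hypothetical toy vectors, and `locate` assumes that an input is placed in the high-level map via the weight vector of its low-level winner.

```python
import random

def nearest(weights, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    return min(range(len(weights)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(weights[i], x)))

def train_som_1d(data, units, steps=3000, seed=0):
    """Tiny 1-D map (a chain of units); a stand-in for any SOM trainer."""
    rng = random.Random(seed)
    weights = [list(rng.choice(data)) for _ in range(units)]  # init from samples
    for t in range(steps):
        decay = 1.0 - t / steps
        lr = 0.5 * decay                       # assumed linear decay
        radius = int((units / 2) * decay)      # shrinking chain neighborhood
        x = rng.choice(data)
        w = nearest(weights, x)
        for i in range(max(0, w - radius), min(units, w + radius + 1)):
            weights[i] = [m + lr * (xk - m) for m, xk in zip(weights[i], x)]
    return weights

def integrate(low_maps, units, seed=0):
    """Train a high-level map on the weight vectors of the low-level maps."""
    training_set = [m for low in low_maps for m in low]
    return train_som_1d(training_set, units, seed=seed)

def locate(x, low_map, high_map):
    """Return (low-level winner, high-level winner) for input x.

    Assumption: x is represented in the high-level map by the weight
    vector of its low-level winner.
    """
    low_unit = nearest(low_map, x)
    return low_unit, nearest(high_map, low_map[low_unit])

# Hypothetical document vectors held at two separate sites.
site_a = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
site_b = [[0.05, 0.05], [0.5, 0.5], [0.55, 0.45], [0.95, 0.95]]
low_a = train_som_1d(site_a, units=3, seed=1)
low_b = train_som_1d(site_b, units=3, seed=2)
high = integrate([low_a, low_b], units=4)
for x in site_a:
    print("site A", x, "->", locate(x, low_a, high))
for x in site_b:
    print("site B", x, "->", locate(x, low_b, high))
```

Since `locate` passes only the low-level winner's weight vector upward, two inputs that share a low-level unit necessarily share a high-level unit. Applying `integrate` to the outputs of earlier `integrate` calls gives deeper hierarchies, and passing only selected weight vectors corresponds to integrating user-chosen map areas.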