The self-organizing map (SOM) [2,3] is one of the
most prominent neural network models adhering to the unsupervised
learning paradigm.
The model consists of p units, each of which is assigned an n-dimensional
weight vector $m_i \in \mathbb{R}^n$.
Training is performed as a repetition of (i) input pattern presentation,
(ii) selection of the unit with the closest weight vector, i.e. the winner,
and (iii) adaptation of the weight vectors of the winner and of a number
of units in the neighborhood of the winner.
It is important to guarantee that both the strength of adaptation and
the number of units that are adapted in addition to the winner decrease
over time.
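The three training steps above can be sketched as follows. This is a minimal illustration rather than the exact procedure used here: the linear decay schedules, the hard-cutoff neighborhood, the rectangular grid, and the names `train_som`, `find_winner`, `alpha0`, and `radius0` are all assumptions made for the example.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def find_winner(weights, x):
    """Step (ii): index of the unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i], x))

def train_som(data, rows, cols, steps=1000, alpha0=0.5, seed=0):
    """Train a rows x cols map on data; returns (weights, grid)."""
    rng = random.Random(seed)
    n = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(n)] for _ in grid]
    radius0 = max(rows, cols) / 2.0      # assumed initial neighborhood radius
    for t in range(steps):
        frac = 1.0 - t / steps           # falls linearly from 1 towards 0
        alpha = alpha0 * frac            # strength of adaptation decreases
        radius = radius0 * frac          # fewer neighbors are adapted over time
        x = rng.choice(data)             # (i) present an input pattern
        w = find_winner(weights, x)      # (ii) select the winner
        for i, pos in enumerate(grid):   # (iii) adapt winner and its neighbors
            if sq_dist(pos, grid[w]) <= radius ** 2:
                weights[i] = [m + alpha * (xi - m)
                              for m, xi in zip(weights[i], x)]
    return weights, grid

if __name__ == "__main__":
    # Made-up two-dimensional "documents" forming two loose groups.
    docs = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
    weights, grid = train_som(docs, rows=2, cols=3)
    for d in docs:
        print(d, "->", grid[find_winner(weights, d)])
```

Because the radius shrinks towards zero, the last iterations adapt only the winner itself, which is one simple way to satisfy the requirement that the adapted neighborhood decreases over time.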
The distinctive feature of the self-organizing map is that similar input patterns will be
arranged in neighboring regions of the resulting map.
Hence, the similarity between input data items, i.e. the documents in our
application, is mirrored in terms of the distance of the respective winners
within the map.
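To make the notion of distance between winners concrete, the sketch below maps two document vectors onto a small hand-built map and measures how far apart their winners lie on the grid. The weight values, the one-dimensional document vectors, and the function name `map_distance` are hypothetical and chosen only for illustration.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def find_winner(weights, x):
    """Index of the unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i], x))

def map_distance(weights, grid, doc_a, doc_b):
    """Grid distance between the winners of two document vectors."""
    ra, ca = grid[find_winner(weights, doc_a)]
    rb, cb = grid[find_winner(weights, doc_b)]
    return math.hypot(ra - rb, ca - cb)

# A hand-built 1 x 3 map (assumed values, not a trained map).
grid = [(0, 0), (0, 1), (0, 2)]
weights = [[0.0], [0.5], [1.0]]

print(map_distance(weights, grid, [0.1], [0.2]))  # 0.0: same winner
print(map_distance(weights, grid, [0.1], [0.9]))  # 2.0: opposite ends of the map
```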
If a digital library is distributed over several sites, it may be
more efficient to maintain independent self-organizing maps that
represent the various parts of the digital library than to transfer
all of the information to one site for training.
However, when some form of uniform access to the data is required, the
contents of the various sites have to be integrated.
In our approach to digital library organization, we suggest using
self-organizing maps to perform such an integration.
In particular, the map that integrates different portions of the
digital library may be trained using the weight vectors of the
maps to be integrated as its input patterns.
Such a strategy may be applied recursively in order to build hierarchies
of arbitrary depth as shown in Figure 1.
In this figure, two maps are integrated into a third, higher-level map.
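The integration step can be sketched as follows: the weight vectors of the lower-level maps are pooled and used as the training set for the higher-level map, and the result can in turn be integrated again. The training routine is a simplified stand-in with assumed linear decay schedules, and the map sizes, weight values, and the name `integrate` are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def find_winner(weights, x):
    """Index of the unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i], x))

def train_som(data, rows, cols, steps=500, alpha0=0.5, seed=0):
    """Simplified SOM training (assumed schedules); returns (weights, grid)."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in data[0]] for _ in grid]
    radius0 = max(rows, cols) / 2.0
    for t in range(steps):
        frac = 1.0 - t / steps
        alpha, radius = alpha0 * frac, radius0 * frac
        x = rng.choice(data)
        w = find_winner(weights, x)
        for i, pos in enumerate(grid):
            if sq_dist(pos, grid[w]) <= radius ** 2:
                weights[i] = [m + alpha * (xi - m)
                              for m, xi in zip(weights[i], x)]
    return weights, grid

def integrate(lower_maps, rows, cols, **train_args):
    """Train a higher-level map on the weight vectors of the given maps."""
    inputs = [list(w) for weights in lower_maps for w in weights]
    return train_som(inputs, rows, cols, **train_args)

if __name__ == "__main__":
    # Made-up weight vectors standing in for maps trained at three sites.
    site_a = [[0.1, 0.1], [0.2, 0.1], [0.9, 0.8]]
    site_b = [[0.1, 0.2], [0.8, 0.9]]
    site_c = [[0.5, 0.5], [0.4, 0.6]]
    level1, _ = integrate([site_a, site_b], rows=2, cols=2)
    # Recursive use: the integrated map is itself integrated with another map.
    level2, grid2 = integrate([level1, site_c], rows=2, cols=2)
    print(len(level2), "units in the top-level map")
```

Only weight vectors cross site boundaries in this sketch; the documents themselves stay where they are, which is the efficiency argument made above.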
Note that selected parts of self-organizing maps may also be integrated
using essentially the same architecture.
The user simply selects areas of interest scattered across different maps
for which an integration is to be performed.
In this way, the user may tie together pieces of information to build her
own library, fine-tuned to her particular interests.
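A selection of this kind might be represented as shown below, assuming for illustration that an area of interest is a rectangle of grid positions. The function `select_area`, the rectangle encoding, and the weight values are hypothetical; the pooled vectors would then serve as the training set of the user's personal map.

```python
def select_area(weights, grid, top_left, bottom_right):
    """Weight vectors of the units inside a rectangular area of one map."""
    (r0, c0), (r1, c1) = top_left, bottom_right
    return [w for w, (r, c) in zip(weights, grid)
            if r0 <= r <= r1 and c0 <= c <= c1]

# Two hand-built maps (assumed values): a 2 x 2 map and a 1 x 3 map.
grid_a = [(0, 0), (0, 1), (1, 0), (1, 1)]
weights_a = [[0.0], [0.1], [0.2], [0.3]]
grid_b = [(0, 0), (0, 1), (0, 2)]
weights_b = [[0.7], [0.8], [0.9]]

# The user picks the top row of map A and the two rightmost units of map B.
personal_inputs = (select_area(weights_a, grid_a, (0, 0), (0, 1))
                   + select_area(weights_b, grid_b, (0, 1), (0, 2)))
print(personal_inputs)  # [[0.0], [0.1], [0.8], [0.9]]
```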
The effect of such an integration is that input data items that are separated across different low-level maps are grouped together in the high-level map. Input data that are mapped onto the same low-level unit are represented together in the high-level map.
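Both effects can be traced in a small sketch that routes a document first to its low-level winner and then, via that unit's weight vector, to a high-level unit. The maps are hand-built with made-up weights rather than trained, and `high_level_unit` is a hypothetical helper introduced for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def find_winner(weights, x):
    """Index of the unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i], x))

def high_level_unit(doc, low_weights, high_weights):
    """Route a document via its low-level winner to a high-level unit."""
    low = find_winner(low_weights, doc)
    return find_winner(high_weights, low_weights[low])

# Assumed weights for two low-level maps and the map that integrates them.
site_a = [[0.1, 0.1], [0.9, 0.9]]
site_b = [[0.15, 0.05], [0.5, 0.5]]
high = [[0.1, 0.1], [0.5, 0.5], [0.9, 0.9]]

# Documents held at different sites end up on the same high-level unit.
print(high_level_unit([0.12, 0.08], site_a, high))  # 0
print(high_level_unit([0.2, 0.0], site_b, high))    # 0

# Documents sharing a low-level unit necessarily share a high-level unit,
# because only the unit's weight vector is passed upwards.
print(high_level_unit([0.0, 0.2], site_a, high))    # 0
```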