Department of Software Technology
Vienna University of Technology
GHSOM - Experiments
Overview
The GHSOM has been applied in several application domains. Below we provide links to experimental results from applying the GHSOM to organize different types of data, specifically text archives and music data.
- Der Standard 1999
A collection of all articles from the Austrian daily newspaper Der Standard.
This collection includes about 50,000 German-language text files, or about 355 MB of HTML text.
Available results: We provide the results of a hierarchical classification using our GHSOM model. These include the feature vectors and labeled hierarchical archives produced by the GHSOM, which allow the effects of different parameter settings to be compared.
- Russian Information Agency Novosti (RIAN)
A collection of articles from the Russian News Agency Novosti in several languages, including Russian, English, French, German, and Arabic, providing an ideal setting for multilingual experiments using the SOMLib system.
A non-parallel corpus of articles from a 14-day period in March 2001 is used to demonstrate the language independence of the SOMLib system.
Furthermore, all articles were automatically translated to provide a single view of the multilingual document collection. Despite the noise introduced by the low-quality automatic translation, the SOMLib system succeeds in detecting correct topic hierarchies.
Available results: Separate GHSOM topic hierarchies for the individual languages, as well as the combined hierarchy of translated articles, are available for interactive exploration, together with the articles and their feature vectors.
- Music Archive - Collection 359
A collection of 359 pieces of popular music from a variety of genres serves as the basis for the experiments presented in this section.
Data files, maps, and detailed descriptions of the musical styles detected by the map are provided, and links to MP3 files let you explore the map's organization interactively.
Comments: rauber@ifs.tuwien.ac.at