Andreas Rauber - Workshop on Data Analysis
(rauber@ifs.tuwien.ac.at) Home: http://www.ifs.tuwien.ac.at/~andi
Previous workshop homepages: WDA 2003 (with photos) in Ruzomberok, WDA 2002 (with photos) in Kosice, WDA 2001 in Budapest, WDA 2000 (with photos) in Kosice.
Fifth Workshop on Data Analysis (WDA2004)
June 24-27, 2004
Tatranska Polianka, Slovak Republic

Overview

Summary

2004 saw the 5th anniversary of the Workshop on Data Analysis. Such an occasion naturally requires an appropriate setting, which was found in the Sliezsky Dom, Tatranska Polianka, in the High Tatras region: at 1,670 meters elevation, it is one of the highest-located hotels in the Slovak Republic. The Workshop, held June 24-27, 2004, consisted of three main thematic sessions, focusing on supervised learning, on cluster analysis, and on general machine learning applications and representational issues.

Within the supervised learning domain, a strong focus was on methods applied to automatic text classification (ATC). Presentations compared different feature selection and document pre-processing methods, as well as the integration of thesaurus information, to increase the quality of the trained classifiers, using a wide range of technologies such as Rocchio, Support Vector Machines, Multi-Layer Perceptrons, and Decision Trees. These were applied to general text classification tasks as well as to more specific domains, like the automatic population of ontologies from text or the automatic genre classification of music based on frequency spectra analysis.

The second thematic area, on cluster analysis, focused mainly on a prominent technology for topology-preserving mappings, namely self-organizing maps (SOM). A range of quality measures and visualizations was discussed, followed by a presentation of extensions to the basic SOM model to create map spaces of different shapes. Furthermore, improvements in feature space representation by incorporating part-of-speech tagging for textual data were presented.

The third section focused on various application domains as well as representational questions.
Specifically, genetic algorithms for function recognition, as well as algorithms for the integration of distributed ranked value data, were presented, followed by approaches to object-oriented representations of XML-structured data.

As last year, a specific break-out session was held to reflect on the limits of data analysis in general, and in particular on the limits of text mining and the World Wide Web. It continued last year's discussion based on two literary works: "The Library of Babel" by Jorge Luis Borges and a short story by Umberto Eco, "On the Impossibility to Draw a Map of the Empire on a Scale of 1 to 1".

Given the splendid location for this year's WDA, following the scientific summits climbed during the sessions, an excursion took the participants to another summit, namely Vychodna Vysoka at 2,428 meters. The high-quality presentations in this Workshop spawned intensive discussions, resulting in a fascinating, dense scientific program. (It also managed to solve the puzzle about the first appearance and usage of the famous IRIS data set as a data mining benchmark.)

WDA 2004 was also pleased to welcome several long-term participants to this 5th anniversary. We would again like to thank all participants for joining and contributing to the tremendous success of this workshop. Special thanks again go to the Austrian Exchange Service and the Slovak Academic Information Agency, who generously supported this workshop under project number 45s10 within the Austrian-Slovak cooperation program.

Program
Some Photos
Below are some photos taken during the seminar, on the trip up Vychodna Vysoka (2,428 m), at the conference dinner celebrating 5 years of WDA, and in Kosice.