MUSCLE eTeam:
Semantic from Audio and Genre Classification for Music
Overview
This eTeam will collaborate on different methods for audio feature
extraction and their application in both supervised classification and unsupervised organization as a means to access and explore audio holdings such as sound archives or, particularly, music. The eTeam
partners have strong expertise in extracting descriptors from audio
data, specializing in music, instruments and other sounds. Moreover,
the partners have expertise in text mining, so textual genre
analysis will be combined with the audio-based approaches.
Furthermore, the partners have core competencies in the application of machine learning techniques for the analysis and structuring of content, and subsequent visualization in 2D as well as 3D environments.
The eTeam will be a concentrated effort on:
- feature extraction from audio
- music classification (based on audio, symbolic
notations, text mining and combined approaches)
- instrument or sound classification
- organization of music/sound by perceptual similarity
- visualizations and access interfaces for the exploration of sound collections in 2D and 3D
Furthermore, the feature sets evaluated in the classification
activities will be employed in unsupervised machine learning tasks in
order to provide an automatic clustering of audio archives, which in
turn serves as an interface for browsing and exploration. eTeam
partners will contribute expertise in visualization, providing intuitive
interfaces for both 2D visualizations and interactive 3D
environments for future access models to audio archives.
Results of the eTeam will be:
- Evaluation of different feature sets for the
classification of music, instruments and other sounds
- Evaluation of multi-modal approaches (combining textual,
symbolic and audio features)
- Evaluation of the performance of those approaches with different
classification methods
- Evaluation of the feature sets in unsupervised machine
learning
- Visualization of audio archives by
application of the feature sets
- Development of prototypes of interactive applications (2D +
3D) for novel browsing methods for audio archives
- Jointly written publications
- Exchanges between the eTeam partners
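The evaluation items in the list above can be pictured as a grid of feature sets against classification methods. The following Python sketch is only an illustration under stated assumptions: the two toy feature sets, the class labels and the two simple classifiers (nearest centroid and 1-nearest-neighbour) are hypothetical stand-ins, not the features or methods used by the eTeam partners. Accuracy is estimated by leave-one-out testing.

```python
def nearest_centroid(train, query):
    """Predict the label whose mean feature vector is closest to query."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1

    def dist(label):
        return sum((s / counts[label] - q) ** 2
                   for s, q in zip(sums[label], query))

    return min(sums, key=dist)


def one_nearest_neighbour(train, query):
    """Predict the label of the single closest training sample."""
    def dist(item):
        return sum((f - q) ** 2 for f, q in zip(item[0], query))

    return min(train, key=dist)[1]


def leave_one_out_accuracy(classifier, samples):
    """Hold out each sample once, train on the rest, count correct predictions."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        hits += classifier(train, features) == label
    return hits / len(samples)


def evaluate(feature_sets, classifiers):
    """Return accuracy for every (feature set, classifier) combination."""
    return {(fs_name, clf_name): leave_one_out_accuracy(clf, samples)
            for fs_name, samples in feature_sets.items()
            for clf_name, clf in classifiers.items()}


# Hypothetical toy data: each sample is (feature vector, class label).
feature_sets = {
    "toy_set_a": [([0.1, 0.2], "class_a"), ([0.2, 0.1], "class_a"),
                  ([0.9, 1.0], "class_b"), ([1.0, 0.8], "class_b")],
    "toy_set_b": [([0.5], "class_a"), ([0.4], "class_a"),
                  ([0.6], "class_b"), ([0.5], "class_b")],
}
classifiers = {"nearest_centroid": nearest_centroid,
               "1-NN": one_nearest_neighbour}

results = evaluate(feature_sets, classifiers)
for (fs_name, clf_name), accuracy in sorted(results.items()):
    print(fs_name, clf_name, accuracy)
```

Keeping the classifiers behind a common `(train, query)` signature is a design choice of this sketch; it lets every feature set be scored by every method without special cases.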
Participants
Contribution of partners
- TU Wien - IFS (A. Rauber, T. Lidy)
We are pursuing a number of activities in the field of audio analysis,
particularly feature extraction for audio retrieval, mostly for genre
classification. We participate in the annual ISMIR MIREX evaluation
contest in these disciplines. Moreover, we are active in text
mining and investigate approaches for textual genre analysis through
the use of web data.
Another focus topic, which would allow
some overlap and integration with the User Interface WP, is interfaces
to audio collections, based on such extracted features and subsequent
machine learning. We are collaborating on interfaces to audio
archives with the EC3 group.
- EC3 (H. Berger, M. Dittenbach)
The EC3 competence center is active in competence fields such as
information logistics (organisation of web-based distributed processes,
especially web services and interoperable solutions for the design of
virtual enterprises), structured content organisation
(automated language processing for the analysis and design of
natural-language user interfaces as well as complex
information systems) and customer and business analysis.
The competence center has expertise in knowledge management and
text mining.
It is specifically active in the development of applications for browsing,
interaction and retrieval from document archives, where a core focus is on the development of 3D environments for the exploration of
text or audio archives.
- AIIA - AUTH (C. Kotropoulos, E. Benetos)
The activity concerns automatic musical instrument classification of
isolated tones and sound segments, based on extracted timbral and MPEG-7
audio features and non-negative matrix factorization (NMF). A joint
publication on supervised classifiers based on NMF, evaluating two
different feature sets, has been written together with TU Wien - IFS.
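As a rough illustration of the factorization underlying the NMF-based approach mentioned above, the following Python sketch approximates a non-negative matrix V by a product WH using the multiplicative update rules of Lee and Seung. It is not the classifier evaluated in the joint publication; the function names, the toy matrix and the chosen rank are assumptions made for this example only.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b]
            for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor non-negative V (m x n) into W (m x rank) and H (rank x n)."""
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    # Random positive initialisation keeps all later updates non-negative.
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(m)]
    h = [[rng.random() + eps for _ in range(n)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(m)]
    return w, h


# Hypothetical toy matrix standing in for non-negative feature data.
v = [[1.0, 2.0, 3.0],
     [2.0, 4.0, 6.0],
     [1.0, 0.0, 1.0],
     [3.0, 4.0, 7.0]]
w, h = nmf(v, rank=2)
print("reconstruction error:", frobenius_error(v, w, h))
```

In a classification setting, factors of this kind would still need a decision rule on top; that step is left out here because the source does not describe it.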
Previous activities
- evaluation of feature sets through the MIREX evaluation
forum
- joint publication of AIIA - AUTH and TU Wien - IFS at EUSIPCO 2006:
Emmanouil Benetos, Constantine Kotropoulos, Thomas Lidy, Andreas Rauber:
"Testing supervised classifiers based on non-negative matrix factorization to musical instrument classification".
Proceedings of the 14th European Signal Processing Conference, Florence, Italy,
September 4-8, 2006
- collaboration between EC3 and TU Wien - IFS on
interactive interfaces to audio collections
Tentative plan of activities
- Collection of feature sets used in the eTeam:
The result of this first stage will be a
state-of-the-art collection of the partners' capabilities and competences
in audio feature extraction.
- Feature Extraction:
The feature extraction algorithms will be run on the test databases.
For textual genre analysis and combined approaches, additional data has
to be acquired at this step.
- Audio Classification:
Extracted audio features will be tested on several classification
methods. There will be a separate evaluation for each discipline: music
classification, instrument classification and classification of other
sounds. Results will indicate both algorithmic and computational
performance and will be published on the eTeam web site.
- Classification with multi-modal approaches:
Combined approaches including textual, symbolic and audio features will
be evaluated on the same classifiers as the plain audio features.
Results will be published on the eTeam web site.
- Clustering of audio archives:
The test databases used in classification will be employed together
with the evaluated feature sets for unsupervised clustering, using
Self-Organizing Maps.
- Visualization of audio archives:
Based on the clustered archives, a number of different visualizations
(views) of each of the archives (music collections, instrument
archives, etc.) will be created and published on the eTeam web site.
- Prototype of browsing application:
A prototype for access to the audio archives based on the clusterings
and visualizations will be created, suitable for browsing and access
through a web browser.
- Interactive environments:
A prototype of an interactive application will be developed, providing
interaction with audio archives in a 3D environment.
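The clustering step in the plan above names Self-Organizing Maps. The following Python sketch shows the basic training loop of such a map, assuming hypothetical two-dimensional toy vectors in place of real audio descriptors. The grid size, learning rate and linear decay schedule are illustrative choices for this example and not the settings used by the eTeam.

```python
import math
import random


def best_matching_unit(weights, sample):
    """Return the (row, col) of the unit whose weight vector is closest."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wi - xi) ** 2 for wi, xi in zip(w, sample))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols map on data and return its weight vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    radius0 = max(rows, cols) / 2.0
    samples = list(data)  # copy, so the caller's order is not shuffled
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        rng.shuffle(samples)
        for x in samples:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                        # decays to 0
            radius = radius0 * (1.0 - frac) + 0.5 * frac   # shrinks to 0.5
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighbourhood around the winner on the grid.
                    grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                    influence = math.exp(-grid_dist2 / (2.0 * radius ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * influence * (x[i] - w[i])
            step += 1
    return weights


# Hypothetical toy feature vectors forming two well-separated groups.
data = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.15],
        [9.8, 9.9], [9.9, 9.7], [9.7, 9.8]]
som = train_som(data)
for sample in data:
    print(sample, "->", best_matching_unit(som, sample))
```

Each item is assigned to the grid position of its best-matching unit, so items with similar features end up on nearby units; this is the property a map-based browsing interface relies on.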
The eTeam fosters collaboration
between the participants and the exchange of know-how in the different
domains and areas of expertise described.
eTeam activities will be
supported by exchanges of researchers between the eTeam institutions and
by jointly written publications (conference papers, articles).
Contact:
Andreas Rauber
Dept. of Software Technology and Interactive Systems
Vienna Univ. of Technology
Favoritenstr. 9 - 11 / 188
A - 1040 Wien
AUSTRIA
e-mail: rauber@ifs.tuwien.ac.at
http://www.ifs.tuwien.ac.at/~andi/