Combination of Audio and Lyrics Features for Genre Classification in Digital Audio Collections

R. Mayer, R. Neumayer, A. Rauber:
"Combination of Audio and Lyrics Features for Genre Classification in Digital Audio Collections";
Talk: ACM Multimedia, Vancouver, Canada; 27.10.2008 - 31.10.2008; in: "Proceedings of the ACM Multimedia 2008", ACM New York, NY, USA, (2008), ISBN: 978-1-60558-303-7; pp. 159 - 168.


In many areas, multimedia technology has made its way into the mainstream. In the case of digital audio, this is manifested in numerous online music stores having turned into profitable businesses. The widespread user adoption of digital audio, both on home computers and on mobile players, shows the size of this market. Thus, ways to automatically process and handle the growing size of private and commercial collections become increasingly important; along with this goes a need to make music interpretable by computers. The most obvious representation of audio files is their sound. There are, however, more ways of describing a song, for instance its lyrics, which describe songs in terms of content words. The lyrics of a piece of music may be orthogonal to its sound, and they differ greatly from other texts regarding their (rhyme) structure. Consequently, the exploitation of these properties has potential for typical music information retrieval tasks such as musical genre classification; so far, however, there is a lack of means to efficiently combine these modalities. In this paper, we present findings from investigating advanced lyrics features such as the frequency of certain rhyme patterns, several part-of-speech features, and statistical features such as words per minute (WPM). We further analyse to what extent a combination of these features with existing acoustic feature sets can be exploited for genre classification, and provide experiments on two test collections.
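
To make the words-per-minute feature and the idea of combining modalities concrete, the following is a minimal Python sketch. It is not the authors' implementation: the tokenisation rule, the function names `words_per_minute` and `combine_features`, the plain concatenation of the two feature vectors, and the numeric audio values are all assumptions chosen for illustration.

```python
import re


def words_per_minute(lyrics: str, duration_seconds: float) -> float:
    """Word tokens in the lyrics divided by the track length in minutes.

    Assumption: a word is any run of letters or apostrophes.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    words = re.findall(r"[A-Za-z']+", lyrics)
    return len(words) / (duration_seconds / 60.0)


def combine_features(audio_features, lyrics_features):
    """Hypothetical combination step: concatenate both vectors into one."""
    return list(audio_features) + list(lyrics_features)


# Illustrative input: 8 words sung over a 30-second excerpt.
lyrics = "la la la la\nsing along with me"
wpm = words_per_minute(lyrics, duration_seconds=30.0)

# Placeholder acoustic values, not taken from any real feature extractor.
combined = combine_features([0.12, 0.80, 0.33], [wpm])

print(wpm)       # 16.0
print(combined)  # [0.12, 0.8, 0.33, 16.0]
```

The combined vector could then be passed to any standard classifier; in practice the individual features would likely need scaling first, since WPM and the acoustic values live on different ranges.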