Million Song Dataset Benchmarks - Downloads

Complete Feature Files (in WEKA ARFF Format)

Feature Set Version Dimensions Download md5 Size
Rhythm Patterns Features (config)
Rhythm Patterns 1.0 1440 msd-rp-v1.0.arff.gz md5 4.13GB
Statistical Spectrum Descriptors 1.0 168 msd-ssd-v1.0.arff.gz md5 641MB
Rhythm Histograms 1.0 60 msd-rh-v1.0.arff.gz md5 240MB
Temporal Statistical Spectrum Descriptors 1.0 1176 msd-tssd-v1.0.arff.gz md5 3.97GB
Temporal Rhythm Histograms 1.0 420 msd-trh-v1.0.arff.gz md5 1.48GB
MVD 1.0 420 msd-mvd-v1.0.arff.gz md5 1.32GB
Marsyas (config)
MARSYAS timbral features 1.0 124 msd-marsyas-timbral-v1.0.arff.gz md5 412MB
jMir (details)
Low-level features (Spectral Centroid, Spectral Rolloff Point, Spectral Flux, Compactness, Spectral Variability, Root Mean Square, Zero Crossings, and Fraction of Low Energy Windows) 1.0 16 msd-jmir-spectral-all-all-v1.0.arff.gz md5 52MB
Low-level features derivatives 1.0 96 msd-jmir-spectral-all-all-v1.0.arff.gz md5 287MB
Method of Moments 1.0 10 msd-jmir-methods-of-moments-all-v1.0.arff.gz md5 36MB
Area of Moments 1.0 20 msd-jmir-area_of_moments-all-v1.0.arff.gz md5 67MB
Linear Predictive Coding (LPC) 1.0 20 msd-jmir-lpc-all-v1.0.arff.gz md5 54MB
MFCC features 1.0 26 msd-jmir-mfcc-all-v1.0.arff.gz md5 72MB
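
Since each feature file is a gzip-compressed ARFF file published with an md5 checksum, a download can be checked and inspected without unpacking it fully. The following is a minimal Python sketch, not part of the benchmark's own tooling. The function names are hypothetical, and the header parser assumes attribute names contain no embedded whitespace.

```python
import gzip
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def arff_attribute_names(path):
    """Read attribute names from the header of a gzip-compressed ARFF file.

    Reading stops at the @data line, so the large data section is not decompressed.
    Assumption: attribute names contain no embedded whitespace.
    """
    names = []
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            stripped = line.strip()
            lowered = stripped.lower()
            if lowered.startswith("@attribute"):
                # The second whitespace-separated token is the attribute name.
                parts = stripped.split(None, 2)
                names.append(parts[1].strip("'\""))
            elif lowered.startswith("@data"):
                break
    return names
```

Comparing the result of md5_of_file("msd-rh-v1.0.arff.gz") with the published md5 value confirms the download is intact, and the length of arff_attribute_names(...) can be compared against the Dimensions column (the file may carry additional non-feature attributes such as an identifier).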

Ground Truth assignments

All Music Guide genres (http://allmusic.com)

Ground Truth Name Number of Labels Description
MSD Allmusic Genre Dataset (MAGD) 422,714 details
MSD Allmusic Top Genre Dataset (Top-MAGD) 406,427 details
MSD Allmusic Style Dataset (MASD) 273,936 details

Classification Tasks

Description
Non-stratified splits
90% training data   MAGD Top-MAGD MASD
80% training data   MAGD Top-MAGD MASD
66% training data   MAGD Top-MAGD MASD
55% training data   MAGD Top-MAGD MASD
Stratified splits
90% training data   MAGD Top-MAGD MASD
80% training data   MAGD Top-MAGD MASD
66% training data   MAGD Top-MAGD MASD
55% training data   MAGD Top-MAGD MASD
Splits with fixed size per genre
1,000 samples training data / genre set   MAGD Top-MAGD MASD
2,000 samples training data / genre set   MAGD Top-MAGD MASD
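
To illustrate the difference between the split types listed above, here is a minimal Python sketch of a stratified percentage split and a fixed-size-per-genre split. It is not the procedure used to generate the published partitions. The function names, the seed handling, and the treatment of genres with fewer tracks than the requested size are all assumptions.

```python
import random
from collections import defaultdict


def _group_by_genre(labels):
    """Group track IDs by genre; labels maps track_id -> genre."""
    by_genre = defaultdict(list)
    for track_id, genre in labels.items():
        by_genre[genre].append(track_id)
    return by_genre


def stratified_split(labels, train_fraction, seed=0):
    """Split so that every genre contributes the same fraction to training."""
    rng = random.Random(seed)
    train, test = [], []
    by_genre = _group_by_genre(labels)
    for genre in sorted(by_genre):
        ids = sorted(by_genre[genre])
        rng.shuffle(ids)
        cut = int(len(ids) * train_fraction)
        train.extend(ids[:cut])
        test.extend(ids[cut:])
    return train, test


def fixed_size_split(labels, per_genre, seed=0):
    """Take at most per_genre training tracks from each genre; the rest is test data.

    Assumption: genres with fewer than per_genre tracks go entirely to training.
    """
    rng = random.Random(seed)
    train, test = [], []
    by_genre = _group_by_genre(labels)
    for genre in sorted(by_genre):
        ids = sorted(by_genre[genre])
        rng.shuffle(ids)
        train.extend(ids[:per_genre])
        test.extend(ids[per_genre:])
    return train, test
```

A non-stratified split would instead shuffle all track IDs together and cut once, so rare genres may end up over- or under-represented in the training set.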

Statistic Files

Description Download
Missing tracks missing_tracks.txt
All sample properties sample_properties.csv.gz
MP3 bitrate statistics bitRateStats.csv
MP3 channel statistics channelStats.csv
MP3 sample rate statistics sampleRateStats.csv
MP3 song length statistics songLengthStats.csv

Scripts

Description Version Link
Assign ground truth labels to feature file 1.0 download
Merge feature file with ground truth 1.0 download
Merge feature file with multi label ground truth 1.0 download
Filter specific features from feature file 1.0 download
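
The idea behind merging a feature file with a ground truth assignment can be sketched in a few lines of Python. This is an illustration only, not the downloadable script. It assumes the ground truth is a tab-separated file of track ID and genre, and that each ARFF data row carries the track ID in a known column (here the last one by default); both layouts are assumptions, as are the function names.

```python
import csv


def load_ground_truth(lines, delimiter="\t"):
    """Build a track_id -> genre mapping from delimited text lines.

    Assumption: the first column is the track ID, the second the genre label.
    """
    labels = {}
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) >= 2:
            labels[row[0]] = row[1]
    return labels


def merge_rows(data_lines, labels, id_column=-1):
    """Yield feature rows extended by their genre label.

    Rows whose track ID has no ground truth entry are skipped.
    Assumption: rows are comma-separated and id_column holds the track ID.
    """
    for line in data_lines:
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        fields = line.split(",")
        track_id = fields[id_column].strip("'\"")
        genre = labels.get(track_id)
        if genre is not None:
            yield fields + [genre]
```

A complete merge would also have to append a nominal class attribute to the ARFF header; that step is omitted here.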