I have finally managed to bring together almost all of my research in one volume. Here is the outcome:
Author: Daniel Wolff
Music Informatics Group, City University London.
Abstract:
Understanding how listeners relate and compare pieces of music is a fundamental challenge in music research as well as for commercial applications: Today’s large-scale applications for music recommendation and exploration utilise various models for similarity prediction to satisfy users’ expectations. Perceived similarity is specific to the individual and influenced by a number of factors such as cultural background and age. Thus, adapting a generic model to human similarity data is useful for personalisation and can help to better understand such differences.
This thesis presents new and state-of-the-art machine learning techniques for modelling music similarity and their first evaluation on relative music similarity data. We expand the scope for future research with methods for similarity data collection and a new dataset. In particular, our models are evaluated on their ability to “spot the odd song out” of three given songs. While a few methods are readily available, others had to be adapted for their first application to such data. We explore the potential for learning generalisable similarity measures, presenting algorithms for metrics and neural networks. A generic modelling workflow is presented and implemented.
We report the first evaluation of the methods on the MagnaTagATune dataset showing learning is possible and pointing out particularities of algorithms and feature types. The best results with up to 74% performance on test sets were achieved with a combination of acoustic and cultural features, but model training proved most powerful when only acoustic information is available. To assess the generalisability of the findings, we provide a first systematic analysis of the dataset itself. We also identify a bias in standard sampling methods for cross-validation with similarity data and present a new method for unbiased evaluation, providing use cases for the different validation strategies.
Furthermore, we present an online game that collects a new similarity dataset, including participant attributes such as age, location, language and music background. It is based on our extensible framework, which manages the storage of participant input and context information as well as the selection of presented samples. The collected data enables a more specific adaptation of music similarity by including user attributes in similarity models. Distinct similarity models are learnt from geographically defined user groups in a first experiment towards the more complex task of culture-aware similarity modelling. In order to improve training of the specific models on small datasets, we implement the concept of transfer learning for music similarity models.
Note: currently only works in the Chrome / Chromium browser.
Download the thesis here.
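
For readers who have not met the "spot the odd song out" task mentioned in the abstract, here is a minimal sketch of how a similarity model could be scored on it. This is an illustration only and not the code from the thesis: the feature vectors are made up, the function names (`predict_odd_one_out`, `triplet_accuracy`) are hypothetical, and a plain Euclidean distance stands in for the learnt metrics and neural networks the thesis actually studies.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance; a stand-in (assumption) for a learnt metric."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_odd_one_out(features, distance=euclidean):
    """Return the index (0, 1 or 2) of the clip predicted to be the odd one out.

    The two clips with the smallest distance are treated as the most
    similar pair; the remaining clip is the prediction.
    """
    # Each entry is (first clip, second clip, clip left over).
    pairs = [(0, 1, 2), (0, 2, 1), (1, 2, 0)]
    _, _, odd = min(pairs, key=lambda p: distance(features[p[0]], features[p[1]]))
    return odd


def triplet_accuracy(triplets, votes, distance=euclidean):
    """Fraction of triplets where the prediction matches the listener's vote."""
    hits = sum(predict_odd_one_out(t, distance) == v for t, v in zip(triplets, votes))
    return hits / len(triplets)


# Toy data (invented): each triplet holds three 2-D feature vectors,
# and each vote is the index a listener picked as the odd one out.
triplets = [
    ([0.0, 0.0], [0.1, 0.0], [1.0, 1.0]),
    ([0.0, 1.0], [1.0, 0.0], [0.9, 0.1]),
]
votes = [2, 1]

accuracy = triplet_accuracy(triplets, votes)
```

Adapting a model to similarity data then amounts to replacing `euclidean` with a distance function whose parameters have been trained so that more predictions agree with the listeners' votes.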