Digital Music Lab: Analysing Big Music Data

This blog post was written during a presentation at the British Library Labs Symposium in November 2014. It is likely full of errors and omissions having been written real-time.

Adam Tovell, Digital Music Curator, British Library & Daniel Wolff, City University

Goal is to develop research methods and s/w infrastructure for exploring and analysing large-scale music collection & provide researchers and users with datasets and computational tool to analyse music audio, scores and metadata.

  • Develop and evaluate music research methods for big data
  • Develop and infrastructure (technical, insitutional, legal) for large-scale music analysis
  • Develop tools for larg-scale computational musicology
  • Use and produce Big Music Data sets

It is possible to use software to analyse aspects of a musical recording. For example looking at:
* Visualisation
* Timings
* Intonation
* Dynamics
* Chord progressions
* Melody

Derived data from s/w analysis can be used to inform research questions.

So far these approaches have been applied to small amounts of music

Field of Music Information Retrieval apply the same techniques to larger bodies of music. These kinds of approaches are behind things like some music recommendation services.

To bring together MIR techniques with musicology academic research approaches need a large body of recorded music – which is where the BL music collection comes in – enabling Large-scale Musicology. BL has over 400 different recording of Chopin’s Nocturne in E-flat major op.9, no.2 – you can ask questions like:
* how has performance changed over time?
* do performers influence each other?
* does place affect performance?
* etc.

BL music collections have over 3 million unique recordings covering a very wide range of genres – popular, traditional, classical, with detailed metadata and a legal framework for making them available to people – sometimes online, and sometimes on-site.

Musicological Questions
* Automatic analysis of scores
* structural analysis from audio
* analysing styles & trends over time
* new similarity metrics (e.g. performance based)
* …

Data sets currently being used:
* British Library – currently curating available music data collections from BL sound archive (currently done around 40k recordings)
* CHARM – 5000 copyright-free recordings + metadata
* ILikeMusic – commercial music library of 1.2M tracks

Analysis results so far:
* ILikeMusic – chord detection
* CHARM – instrumentation analysis
* MIDI-scal transcription
* High-res transcription (create scores from recording)
* BL – key detection, + more

Visualisations – available at http://dml.city.ac.uk

Automatic Tagging – e.g. genre, style, period. To expensive to tag large datasets, automated classification challenging especially without ‘ground truth’.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.