TY - GEN
T1 - A Multi-Task Music Artist Classification Network
AU - Panda, Swaroop
AU - Namboodiri, Vinay P.
PY - 2020/2/29
Y1 - 2020/2/29
N2 - Music artist classification is a challenging task in Music Information Retrieval. Existing methods are based on either signal processing features or deep learning algorithms. While signal-processing-based approaches perform well, the difficulty of the task implies that signal processing alone does not suffice to provide good features. Approaches that rely on deep learning learn representations from large amounts of labelled data. A limitation of these approaches is that they require a large annotated corpus to obtain a good set of parameters. In this work, we pose auxiliary signal-processing-based tasks for a deep learning network, including predicting the harmonic-percussive and vocal/non-vocal components of audio files. We show that training on a combination of these tasks yields a well-trained CNN with a good set of parameters, which can then be used for the artist classification task. We use a multi-task framework and a popular deep learning architecture to train the model jointly on artist classification and the auxiliary tasks. We observe that vocal/non-vocal separation together with alignment prediction proves to be a good auxiliary task for artist classification, improving on the baseline by 6%.
AB - Music artist classification is a challenging task in Music Information Retrieval. Existing methods are based on either signal processing features or deep learning algorithms. While signal-processing-based approaches perform well, the difficulty of the task implies that signal processing alone does not suffice to provide good features. Approaches that rely on deep learning learn representations from large amounts of labelled data. A limitation of these approaches is that they require a large annotated corpus to obtain a good set of parameters. In this work, we pose auxiliary signal-processing-based tasks for a deep learning network, including predicting the harmonic-percussive and vocal/non-vocal components of audio files. We show that training on a combination of these tasks yields a well-trained CNN with a good set of parameters, which can then be used for the artist classification task. We use a multi-task framework and a popular deep learning architecture to train the model jointly on artist classification and the auxiliary tasks. We observe that vocal/non-vocal separation together with alignment prediction proves to be a good auxiliary task for artist classification, improving on the baseline by 6%.
KW - Artificial neural networks
KW - Classification algorithms
KW - Source separation
UR - http://www.scopus.com/inward/record.url?scp=85085035770&partnerID=8YFLogxK
U2 - 10.1109/CINE48825.2020.234390
DO - 10.1109/CINE48825.2020.234390
M3 - Conference contribution
AN - SCOPUS:85085035770
T3 - 4th International Conference on Computational Intelligence and Networks, CINE 2020
BT - 4th International Conference on Computational Intelligence and Networks, CINE 2020
PB - IEEE
CY - Piscataway, NJ
T2 - 4th International Conference on Computational Intelligence and Networks, CINE 2020
Y2 - 27 February 2020 through 29 February 2020
ER -