Companion page for the chapter

Musical audio decomposition through nonnegative factorizations of the power spectrogram and the Itakura-Saito divergence

by Cédric Févotte

 in "Machine Audition: Principles, Algorithms and Systems" edited by W. Wang


The editorial staff at IGI introduced significant errors in the published version of this chapter; in particular, one figure has been split into two parts, another figure is missing, and references to figures and tables have been mixed up. A nearly error-free document produced from the original LaTeX source is available here.


1) Analysis of a short piano excerpt

Data

IS-NMF on power spectrogram
pitch =   65.0   68.0   61.0  72.0  0   0   0   0
Component 1
Component 2
Component 3
Component 4
Component 5
Component 6
Component 7
Component 8
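
For readers who want to reproduce a decomposition of this kind, here is a minimal sketch of NMF under the Itakura-Saito divergence using the standard multiplicative updates. It is an illustration only, not the code used to produce the sounds on this page: the function names, the random initialisation, the iteration count and the small constant eps are all assumptions, and V stands for any strictly positive power spectrogram (frequency bins by time frames).

```python
import numpy as np

def is_divergence(V, V_hat):
    """Itakura-Saito divergence summed over all entries: x/y - log(x/y) - 1."""
    R = V / V_hat
    return float(np.sum(R - np.log(R) - 1.0))

def is_nmf(V, K, n_iter=100, seed=0, eps=1e-12):
    """Approximate V (F x N, strictly positive) by W @ H with K components.

    Hypothetical helper: plain multiplicative updates, random initialisation.
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((F, K)) + eps
    H = rng.random((K, N)) + eps
    for _ in range(n_iter):
        V_hat = W @ H + eps
        # Update activations H with W fixed.
        H *= (W.T @ (V * V_hat ** -2)) / (W.T @ (V_hat ** -1))
        V_hat = W @ H + eps
        # Update spectral templates W with H fixed.
        W *= ((V * V_hat ** -2) @ H.T) / ((V_hat ** -1) @ H.T)
    return W, H

# Toy usage on random data standing in for a power spectrogram.
V = np.random.default_rng(1).random((20, 30)) + 0.1
W, H = is_nmf(V, K=8, n_iter=50)
```

With K = 8, column k of W and row k of H would correspond to "Component k" in the list above; each component signal would then be reconstructed separately from its share of the spectrogram.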

KL-NMF on magnitude spectrogram
pitch =   65.2   68.2   61.0  72.2  0   56.2   0   0
Component 1
Component 2
Component 3
Component 4
Component 5
Component 6
Component 7
Component 8
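
The "pitch" lines above can be read more easily once converted to note names and frequencies. The sketch below assumes that the values are MIDI note numbers (69 = A4 = 440 Hz) and that 0 marks a component with no pitch estimate; neither is stated on this page, and the helper names are hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def midi_to_hz(p):
    """Frequency in Hz of a (possibly fractional) MIDI pitch, with A4 = 440 Hz."""
    return 440.0 * 2.0 ** ((p - 69.0) / 12.0)

def nearest_note_name(p):
    """Name of the equal-tempered note closest to MIDI pitch p, e.g. 'F4'."""
    n = int(round(p))
    return f"{NOTE_NAMES[n % 12]}{n // 12 - 1}"

def describe(pitches):
    """Map each pitch to (note name, frequency), or None where the value is 0.

    Treating 0 as 'unpitched' is an assumption about the listing.
    """
    return [None if p == 0 else (nearest_note_name(p), midi_to_hz(p))
            for p in pitches]

# Values copied from the two listings above.
is_nmf_pitches = [65.0, 68.0, 61.0, 72.0, 0, 0, 0, 0]
kl_nmf_pitches = [65.2, 68.2, 61.0, 72.2, 0, 56.2, 0, 0]

for label, pitches in [("IS-NMF", is_nmf_pitches), ("KL-NMF", kl_nmf_pitches)]:
    print(label, describe(pitches))
```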




2) Decomposition of a real stereo recording

Stereo data


IS-NMF on power spectrogram
(K = 8 and J = 4)
Stereo decomposition
Component 1 (Source 1)
Component 2 (Source 1)
Component 3 (Source 2)
Component 4 (Source 2)
Component 5 (Source 3)
Component 6 (Source 3)
Component 7 (Source 4)
Component 8 (Source 4)
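
The labelling above follows from sharing the K components equally between the J sources, in index order. A small sketch of that bookkeeping, with a hypothetical helper name and 1-based indices as in the list:

```python
def source_of_component(k, K, J):
    """Return the 1-based source index owning component k (1-based).

    Assumes the K components are split into J contiguous, equally sized
    groups, as in the listing above (K = 8, J = 4 gives two per source).
    """
    if K % J != 0:
        raise ValueError("K must be a multiple of J for an equal split")
    if not 1 <= k <= K:
        raise ValueError("component index out of range")
    per_source = K // J
    return (k - 1) // per_source + 1

for k in range(1, 9):
    print(f"Component {k} (Source {source_of_component(k, K=8, J=4)})")
```

The same rule applied to K = 24 and J = 4 assigns six components to each source.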

Manually bound source estimates

Bass + Guitars + Synth
Drums
Voice


Manually bound source estimates
from a decomposition with K=24 components equally shared between J=4 sources

Guitar 1
Guitar 2
Bass
Drums
Synth
Voice