Study of the effect of various transforms
on performance in blind audio source separation
V. Y. F. Tan and C. Févotte
We deal with the case where the sources are linearly mixed and the
mixtures are underdetermined: there are fewer mixtures than sources, so
the mixing matrix A has more columns than rows. Sparsity of the sources
is vital for good separation. A Bayesian approach is used, in which the
Gibbs sampler (a standard MCMC simulation method) estimates the sources
and the mixing matrix in the presence of noise.
I.i.d. Gaussian noise was added to the observations, which resulted in
an SNR of about 16 dB. The mixing matrix used is given by
A = [ 0.4000  0.8315  0.5657;
     -0.6928 -0.3444  0.5657].
1) Introduction: Orthonormal Bases
Six different orthonormal bases were applied to four different sets of
signals. The bases are the DCT-IV, the MDCT, two different versions of
the DWT, Wickerhauser's Wavelet Packet Best Basis and the identity (no
transform). They were tested on speech signals, musical signals,
percussion signals and a combination of the three.
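As an illustration of what "orthonormal basis" means here, the following is a minimal sketch that builds the DCT-IV as an explicit matrix and checks that it is orthonormal, so that a signal can be transformed and recovered exactly. The function name dct_iv_matrix, the frame length of 8 and the random test frame are assumptions made for this example only; they are not taken from the study.

```python
import numpy as np


def dct_iv_matrix(n):
    """Return the n x n orthonormal DCT-IV matrix (hypothetical helper)."""
    k = np.arange(n).reshape(-1, 1)  # frequency index, one per row
    m = np.arange(n).reshape(1, -1)  # time index, one per column
    # The sqrt(2/n) scaling makes the rows orthonormal.
    return np.sqrt(2.0 / n) * np.cos(np.pi / n * (k + 0.5) * (m + 0.5))


n_points = 8  # illustrative frame length, not a value from the study
C = dct_iv_matrix(n_points)

# Orthonormality check: C @ C.T should equal the identity matrix.
orthonormality_error = np.max(np.abs(C @ C.T - np.eye(n_points)))

# Transform an arbitrary frame and invert with the transpose.
rng = np.random.default_rng(0)
frame = rng.normal(size=n_points)
coeffs = C @ frame
reconstructed = C.T @ coeffs
reconstruction_error = np.max(np.abs(reconstructed - frame))

# An orthonormal transform preserves energy (Parseval's relation).
energy_difference = abs(np.sum(frame ** 2) - np.sum(coeffs ** 2))
```

Because the transform is linear and invertible, applying it to each mixture is equivalent to mixing the transformed sources with the same matrix A, which is why separation can be carried out on the transform coefficients.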
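The model used is the standard BSS model,
x = As + n
where x denotes the observed mixtures, A the mixing matrix, s the
sources and n the additive noise.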
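The mixing setup can be sketched as follows, using the 2 x 3 mixing matrix quoted above and i.i.d. Gaussian noise scaled to a 16 dB SNR. This is a sketch under stated assumptions, not the code used in the study: the Laplacian-distributed sources are synthetic stand-ins for the audio signals, the number of samples is arbitrary, and the SNR is assumed to be the ratio of the power of the noise-free mixtures to the noise power, averaged over both channels.

```python
import numpy as np

rng = np.random.default_rng(0)

# Mixing matrix from the text: 2 mixtures (rows), 3 sources (columns).
A = np.array([[0.4000, 0.8315, 0.5657],
              [-0.6928, -0.3444, 0.5657]])

# ASSUMPTION: synthetic sparse (Laplacian) sources stand in for real audio.
n_samples = 10000
s = rng.laplace(size=(3, n_samples))

# Noise-free mixtures.
clean = A @ s

# ASSUMPTION: SNR = power of noise-free mixtures / noise power,
# averaged over both channels.
target_snr_db = 16.0
signal_power = np.mean(clean ** 2)
noise_power = signal_power / 10 ** (target_snr_db / 10)

# I.i.d. Gaussian noise with the variance that gives the target SNR.
n = rng.normal(scale=np.sqrt(noise_power), size=clean.shape)

# Observed mixtures, following x = As + n.
x = clean + n

# Empirical SNR of this realisation, for checking.
measured_snr_db = 10 * np.log10(np.mean(clean ** 2) / np.mean(n ** 2))
```

With more sources than mixtures, A has no inverse, so s cannot be recovered from x by a simple matrix inversion even when A is known; this is where the sparsity of the sources comes in.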