Blind Audio Source Separation - Combination
We consider the case where the sources are linearly mixed and the
mixing is underdetermined: there are fewer mixtures than sources, so
the mixing matrix A has more columns than rows (here, 2 mixtures of 3
sources). Sparsity of the sources is vital for good separation.
Bayesian methods such as the Gibbs Sampler (a standard MCMC simulation
method) are used to estimate the sources and the mixing matrix in the
presence of noise. I.I.D. Gaussian noise was added to the observations,
giving an SNR of about 16 dB. The mixing matrix used is
A = [0.4000 0.8315 0.5657; -0.6928 -0.3444 0.5657].
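The mixing step described above can be sketched as follows. This is only an illustration under stated assumptions: the sources are synthetic Laplacian samples standing in for the real audio, the variable names are hypothetical, and the SNR is taken over both mixtures together, which the page does not specify. Only the matrix A and the 16 dB target come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Mixing matrix quoted in the text: 2 mixtures (rows), 3 sources (columns).
A = np.array([[0.4000, 0.8315, 0.5657],
              [-0.6928, -0.3444, 0.5657]])

# Placeholder sources (assumption): Laplacian samples stand in for sparse audio.
n_samples = 10_000
S = rng.laplace(size=(3, n_samples))

# Noise-free linear mixtures, one row per mixture.
X_clean = A @ S

# Scale i.i.d. Gaussian noise so that the overall SNR is about 16 dB.
target_snr_db = 16.0
signal_power = np.mean(X_clean ** 2)
noise_power = signal_power / 10 ** (target_snr_db / 10)
noise = rng.normal(scale=np.sqrt(noise_power), size=X_clean.shape)

# Noisy observations that a separation method would receive.
X = X_clean + noise

achieved_snr_db = 10 * np.log10(signal_power / np.mean(noise ** 2))
print(X.shape, round(achieved_snr_db, 2))
```

Because A is 2 x 3, it cannot be inverted to recover S from X, which is why the separation has to rely on the sparsity of the sources.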
5) Combination of Signals
These are the 3 source signals of different types.
Speech Signal 1
"While the Jeffers had reached its limit, it was now mid-August, which
meant he had been separated from Marshall from..."
Musical Signal 2
Piano
Percussion Signal 3
Low Frequency Drums
These are the 2 mixtures.
Mixture 1
Mixture 2
Sparsity Indices for various transform types.
Transform Types
1 | 2 | 3 | 4 | 5 | 6
DCT | MDCT | WT (Vai) | WT (Sym8) | WPBB | No Transform
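To illustrate why the choice of transform affects sparsity, the sketch below compares a test signal in the time domain with its DCT coefficients. The page does not define its sparsity index, so the measure used here (the number of coefficients needed to hold 99% of the energy) is an assumption chosen for illustration, as are the function names and the test signal.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    t = np.arange(n)[None, :]   # time index (columns)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    C[0, :] /= np.sqrt(2.0)     # rescale the DC row for orthonormality
    return C

def coeffs_for_energy(x, fraction=0.99):
    """Smallest number of coefficients that hold `fraction` of the energy."""
    energy = np.sort(np.asarray(x, dtype=float) ** 2)[::-1]
    cumulative = np.cumsum(energy)
    return int(np.searchsorted(cumulative, fraction * cumulative[-1]) + 1)

# Hypothetical test signal: a cosine aligned with one DCT basis function.
n = 256
t = np.arange(n)
signal = np.cos(np.pi * (t + 0.5) * 10 / n)

time_count = coeffs_for_energy(signal)
dct_count = coeffs_for_energy(dct_matrix(n) @ signal)
print(time_count, dct_count)
```

Fewer coefficients means a sparser representation. Real audio will not be this extreme, but the same comparison can be run across transforms.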
Results at a glance.
Performance Measures
1 | 2 | 3 | 4
SDR | SIR | SAR | SNR
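Of the four measures, the SNR is the simplest to state. A minimal sketch follows, assuming the usual definition as the ratio of reference energy to reconstruction error energy in dB; the page does not give its exact formula, and the function name is hypothetical. SDR, SIR and SAR are commonly defined by splitting the error into interference and artifact parts and are not sketched here.

```python
import numpy as np

def snr_db(reference, estimate):
    """SNR in dB between a reference source and its reconstruction."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    error = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(error ** 2))

# Usage: an estimate that is 10% too loud has an error energy of 1% of
# the reference energy, which corresponds to 20 dB.
reference = np.sin(np.linspace(0.0, 20.0, 1000))
estimate = 1.1 * reference
print(round(snr_db(reference, estimate), 2))
```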
5.1) Discrete Cosine Transform
Reconstructed Speech Signal 1
Reconstructed Musical Signal 2
Reconstructed Percussion Signal 3
5.2) Modified Discrete Cosine Transform
Reconstructed Speech Signal 1
Reconstructed Musical Signal 2
Reconstructed Percussion Signal 3
The MDCT is the best overall basis: for a combination of signal types
it performs best of all the transforms tried here.
It closely approximates the optimal Karhunen-Loeve Transform.
5.3) Wavelet Transform: Vaidyanathan
Reconstructed Speech Signal 1
Reconstructed Musical Signal 2
Reconstructed Percussion Signal 3
Wavelets perform the worst in this experiment.
5.4) Wavelet Transform: Symmlet 8
Reconstructed Speech Signal 1
Reconstructed Musical Signal 2
Reconstructed Percussion Signal 3
Wavelets perform the worst in this experiment.
5.5) Wavelet Packet Best Basis
Reconstructed Speech Signal 1
Reconstructed Musical Signal 2
Reconstructed Percussion Signal 3
The Best Basis also performs well, but the sound quality is clearly
worse than with the MDCT above.
5.6) No Transform
Reconstructed Speech Signal 1
Reconstructed Musical Signal 2
Reconstructed Percussion Signal 3
Surprisingly, even without a transform, the signals can be
reconstructed.