Personal notes on EUSIPCO 2017
Plenary talk 1: Machine Learning Approaches for Solving Inverse Imaging Problems, Aggelos Katsaggelos
- Subject: compare and contrast analytical and learning approaches in solving inverse problems: inpainting, concealment, etc.
- Analytical approaches typically require a single observation; learning approaches require a large amount of data
- Learning approaches: hard to incorporate domain knowledge
- A very interesting recent idea: use a feature-space loss instead of the input-output MSE, i.e. compute the MSE on layer activations. Why? Because the comparison is then made on semantically meaningful features
- GANs: a minimax two-player game. People use conditional GANs in practice.
- GANs and (V)AEs learn prior distributions over the data, which is one way to incorporate knowledge into learning approaches
- CNNs >> FCNNs: fewer parameters, translation invariance and locality
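The feature-space loss idea above can be sketched in a few lines of NumPy. This is only an illustration under strong assumptions: the "network" is a single frozen random layer (`weights`, hypothetical), whereas in practice the activations would come from a pretrained network, and the images are random vectors.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def features(x, weights):
    """Activations of one hidden layer (stand-in for a pretrained network)."""
    return relu(x @ weights)

def pixel_mse(output, target):
    """Usual input-output MSE, computed directly on pixel values."""
    return float(np.mean((output - target) ** 2))

def feature_mse(output, target, weights):
    """MSE computed on layer activations instead of on pixels."""
    diff = features(output, weights) - features(target, weights)
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
weights = rng.normal(size=(64, 16))   # frozen layer; random here, pretrained in practice
target = rng.normal(size=(8, 64))     # batch of 8 flattened "images" (synthetic)
output = target + 0.1 * rng.normal(size=target.shape)  # imperfect reconstruction

print("pixel MSE:  ", pixel_mse(output, target))
print("feature MSE:", feature_mse(output, target, weights))
```

During training, the feature-space term would replace (or be added to) the pixel-space term while the feature extractor stays fixed.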
Plenary talk 2: On Learning Invariants and Representation Spaces of Shapes and Forms, Ron Kimmel
Plenary talk 3: Linearly-convergent stochastic gradient algorithms, Francis Bach
- Slides available (PDF)
- Subject: gradient descent algorithms, optimization
- Proposes Stochastic Average Gradient algos: SAG, SVRG, SAGA
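- Idea: at each iteration, recompute the gradient of a single randomly chosen training example instead of all of them, keep the last gradient seen for every example in memory, and update the parameters with the average of these stored gradients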
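A minimal sketch of the SAG idea, assuming a noise-free least-squares problem with synthetic data. The function name `sag_least_squares` and the problem sizes are made up for illustration, and the step size 1/(16L) is the conservative constant-step choice from the SAG analysis, not something taken from the talk.

```python
import numpy as np

def sag_least_squares(A, b, n_epochs=300, seed=0):
    """Minimize (1/n) * sum_i 0.5 * (a_i . x - b_i)^2 with SAG."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    grad_table = np.zeros((n, d))        # last gradient seen for each example
    grad_sum = np.zeros(d)               # sum of the rows of grad_table
    L = np.max(np.sum(A ** 2, axis=1))   # largest per-example Lipschitz constant
    step = 1.0 / (16.0 * L)
    for _ in range(n_epochs * n):
        i = rng.integers(n)                   # pick one example at random
        g_new = (A[i] @ x - b[i]) * A[i]      # its gradient at the current x
        grad_sum += g_new - grad_table[i]     # refresh the running sum
        grad_table[i] = g_new
        x -= step * grad_sum / n              # step along the average gradient
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(100, 3))
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true                                # noise-free targets
x_hat = sag_least_squares(A, b)
print("distance to the true solution:", np.linalg.norm(x_hat - x_true))
```

Each iteration costs one example gradient, like SGD, but the memory of past gradients is what allows a constant step size and linear convergence. The price is the n-by-d gradient table.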
Plenary talk 4: The Power of Low Rank Tensor Approximations in Smart Patient Monitoring, Sabine Van Huffel
- Slides available (PDF)
- Subject: low-rank tensor approximations as building blocks, and how to make these mathematical decompositions “interpretable” so that they reveal the underlying clinically relevant information and improve medical diagnosis
- Application 1: Irregular heartbeat detection with Kronecker Product Equations compared to PARAFAC2 and Multilinear SVD (MSVD), removal of water frequency component with Hankel SVD (HSVD)
- Application 2: brain tumor detection
- Method 1: NMF: the FFT spectra Y are approximated as Y ≈ WH, where W: weights, H: activations
- Method 2: hierarchical NMF: first detect the presence of a tumor, then decide malignant/non-malignant
- Method 3: (Non-negative) Canonical Polyadic Decomposition (CPD and NCPD)
- Software to manipulate tensors: www.tensorlab.net
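To make the NMF of Method 1 concrete, here is a small sketch using the classical multiplicative updates for the Frobenius norm. It is not the method from the talk: the matrix `Y` is synthetic (a stand-in for non-negative magnitude spectra), and `nmf`, the rank and the iteration count are arbitrary choices for illustration.

```python
import numpy as np

def nmf(Y, rank, n_iter=200, eps=1e-12, seed=0):
    """Approximate a non-negative matrix Y as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = Y.shape
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ Y) / (W.T @ W @ H + eps)
        W *= (Y @ H.T) / (W @ H @ H.T + eps)
    return W, H

def relative_error(Y, W, H):
    return float(np.linalg.norm(Y - W @ H) / np.linalg.norm(Y))

rng = np.random.default_rng(0)
Y = rng.random((30, 2)) @ rng.random((2, 20))   # synthetic non-negative rank-2 data

W0, H0 = nmf(Y, rank=2, n_iter=0)               # random initialization only
W, H = nmf(Y, rank=2)                           # same initialization, 200 updates
print("relative error before:", relative_error(Y, W0, H0))
print("relative error after: ", relative_error(Y, W, H))
```

With real data, each column of Y would hold the magnitude spectrum of one measurement.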
Plenary talk 5: Speech synthesis: where did the signal processing go? Simon King
- Slides available (PDF)
- Subject: The most recent boost to quality in speech synthesis has come about through a convergence of acoustic modelling and waveform generation, in which the model directly generates a waveform
- Teaching Website on speech: http://www.speech.zone/
- WORLD: open-source vocoder, a high-quality speech analysis, manipulation and synthesis system: github repo
- WaveNet outputs 8-bit quantized waveform samples
- Audio samples from "Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model": html link
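On the 8-bit WaveNet output: the WaveNet paper obtains the 256 levels by applying mu-law companding before quantizing. The sketch below shows that transform and its inverse. The helper names, the test tone and the sampling rate are assumptions for illustration, not WaveNet code.

```python
import numpy as np

MU = 255  # 256 quantization levels, i.e. 8 bits

def mu_law_encode(x, mu=MU):
    """Map samples in [-1, 1] to integer codes in [0, mu]."""
    x = np.clip(x, -1.0, 1.0)
    y = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)   # compand to [-1, 1]
    return np.round((y + 1.0) / 2.0 * mu).astype(np.uint8)     # quantize

def mu_law_decode(q, mu=MU):
    """Map integer codes in [0, mu] back to samples in [-1, 1]."""
    y = 2.0 * q.astype(np.float64) / mu - 1.0
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu)) / mu

# One second of a 440 Hz tone at an assumed 16 kHz sampling rate.
t = np.linspace(0.0, 1.0, 16000, endpoint=False)
x = 0.8 * np.sin(2.0 * np.pi * 440.0 * t)

q = mu_law_encode(x)
x_rec = mu_law_decode(q)
print("codes range from", q.min(), "to", q.max())
print("max reconstruction error:", np.max(np.abs(x - x_rec)))
```

The companding spends more of the 256 levels on small amplitudes, which is why 8 bits are enough for the model to predict each sample with a categorical distribution.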
Datasets
- The Million Song Dataset: a freely-available collection of audio features and metadata for a million contemporary popular music tracks
- Stanford background dataset (14.0MB): 715 images of outdoor scenes, each containing at least one foreground object and having the horizon position within the image
- 2D Semantic Labeling - Vaihingen data
- wTIMIT: whispered TIMIT
- PASCAL VOC: image data sets for object class recognition
- IFADV: Dialog Video corpus, annotated video recordings of friendly Face-to-Face dialogs
- UrbanSound dataset: 1302 labeled sound recordings with 10 classes: air_conditioner, car_horn, children_playing, dog_bark, drilling, engine_idling, gun_shot, jackhammer, siren, and street_music
- MAPS: a piano database for multipitch estimation and automatic transcription of music
List of papers