From: Multi-rate modulation encoding via unsupervised learning for audio event detection
Model | PSDS1 | PSDS2 |
---|---|---|
DCASE2023 baseline | 0.365 ± 0.010 | 0.581 ± 0.003 |
3\(\times\)CRNN random-init | 0.363 ± 0.004 | 0.594 ± 0.007 |
3\(\times\)CRNN VAE-init | 0.374 ± 0.003 | 0.607 ± 0.009 |
3\(\times\)CRNN ModVAE-init | 0.375 ± 0.006 | 0.627 ± 0.005 |