From: Supervised Attention Multi-Scale Temporal Convolutional Network for monaural speech enhancement
 | No RIR | | | | | | | RIR | | | | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SNR | −5 dB | 0 dB | 5 dB | 10 dB | 15 dB | 20 dB | \(\Delta\)Avg. | −5 dB | 0 dB | 5 dB | 10 dB | 15 dB | 20 dB | \(\Delta\)Avg. |
Unprocessed | 83.1 | 88.4 | 91.7 | 94.7 | 96.8 | 98.1 | - | 77.1 | 83.9 | 88.2 | 92.3 | 95.2 | 97.1 | - |
CRN | 86.7 | 91.1 | 93.5 | 95.4 | 96.8 | 97.6 | 1.4 | 82.2 | 87.7 | 90.7 | 93.4 | 95.3 | 96.5 | 2.0 |
MSTCN | 87.0 | 91.4 | 93.8 | 95.9 | 97.3 | 98.1 | 1.8 | 83.2 | 88.7 | 91.7 | 94.4 | 96.3 | 97.6 | 3.0 |
LSTM-IRM | 89.5 | 93.2 | 95.2 | 96.9 | 98.0 | 98.7 | 3.2 | 86.0 | 90.7 | 93.3 | 95.6 | 97.1 | 98.2 | 4.5 |
GCRN | 87.7 | 91.7 | 93.8 | 95.6 | 96.7 | 97.5 | 1.7 | 83.4 | 88.5 | 91.3 | 93.7 | 95.3 | 96.4 | 2.4 |
GaGNet | 89.5 | 91.6 | 93.9 | 95.8 | 97.1 | 98.0 | 2.2 | 83.3 | 88.6 | 91.5 | 94.2 | 95.9 | 97.2 | 2.8 |
Conv-TasNet | 89.7 | 93.2 | 95.0 | 96.6 | 97.6 | 98.4 | 2.9 | 85.4 | 90.1 | 92.6 | 95.0 | 96.6 | 97.7 | 3.9 |
DCCRN | 89.3 | 93.1 | 95.1 | 96.8 | 97.9 | 98.6 | 3.0 | 85.5 | 90.4 | 93.0 | 95.4 | 96.9 | 98.0 | 4.2 |
DPCRN | 89.2 | 92.9 | 94.9 | 96.6 | 97.7 | 98.5 | 2.9 | 86.1 | 90.8 | 93.3 | 95.6 | 97.1 | 98.2 | 4.5 |
SA-MSTCN\(^{1}\) | 90.6 | 94.0 | 95.8 | 97.2 | 98.2 | 98.8 | 3.7 | 87.9 | 92.1 | 94.3 | 96.2 | 97.6 | 98.4 | 5.4 |
SA-MSTCN\(^{2}\) | 90.7 | 94.2 | 95.9 | 97.3 | 98.3 | 98.8 | 3.8 | 87.9 | 92.1 | 94.4 | 96.3 | 97.6 | 98.5 | 5.5 |
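
The table does not define the \(\Delta\)Avg. column. The numbers are consistent with reading it as a system's mean score over the six SNR levels minus the mean score of the unprocessed signal in the same condition. The sketch below illustrates that assumed definition for the CRN row; the function name `delta_avg` is hypothetical. Because it works from the rounded table entries, the recomputed value for some rows can differ from the printed one by 0.1.

```python
# Scores copied from the table, ordered by SNR: -5, 0, 5, 10, 15, 20 dB.
unprocessed_no_rir = [83.1, 88.4, 91.7, 94.7, 96.8, 98.1]
crn_no_rir = [86.7, 91.1, 93.5, 95.4, 96.8, 97.6]

unprocessed_rir = [77.1, 83.9, 88.2, 92.3, 95.2, 97.1]
crn_rir = [82.2, 87.7, 90.7, 93.4, 95.3, 96.5]


def delta_avg(system, baseline):
    """Assumed definition: mean(system) - mean(baseline), rounded to one decimal."""
    mean_system = sum(system) / len(system)
    mean_baseline = sum(baseline) / len(baseline)
    return round(mean_system - mean_baseline, 1)


print(delta_avg(crn_no_rir, unprocessed_no_rir))  # 1.4, matching the CRN row
print(delta_avg(crn_rir, unprocessed_rir))        # 2.0, matching the CRN row
```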