From: Supervised Attention Multi-Scale Temporal Convolutional Network for monaural speech enhancement
 | No RIR | | | | | | | RIR | | | | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SNR | −5 dB | 0 dB | 5 dB | 10 dB | 15 dB | 20 dB | \(\Delta\)Avg. | −5 dB | 0 dB | 5 dB | 10 dB | 15 dB | 20 dB | \(\Delta\)Avg. |
Unprocessed | 3.79 | 8.84 | 12.84 | 17.84 | 22.84 | 27.84 | - | 3.73 | 8.77 | 12.77 | 17.77 | 22.77 | 27.77 | - |
CRN | 11.24 | 14.77 | 17.49 | 20.90 | 24.24 | 27.43 | 3.68 | 10.88 | 14.36 | 17.09 | 20.52 | 23.97 | 27.25 | 3.41 |
MSTCN | 9.79 | 13.22 | 15.74 | 18.51 | 20.71 | 22.23 | 1.03 | 9.44 | 12.88 | 15.41 | 18.26 | 20.60 | 22.27 | 0.88 |
LSTM-IRM | 11.69 | 15.33 | 18.20 | 21.83 | 25.47 | 29.18 | 4.61 | 11.47 | 15.03 | 17.90 | 21.58 | 25.45 | 29.45 | 4.55 |
GCRN | 12.71 | 16.06 | 18.64 | 21.79 | 24.81 | 27.55 | 4.59 | 11.91 | 15.27 | 17.89 | 21.09 | 24.22 | 27.01 | 3.97 |
GaGNet | 12.27 | 15.62 | 18.30 | 21.55 | 24.79 | 28.09 | 4.43 | 12.15 | 15.49 | 18.13 | 21.54 | 25.02 | 28.68 | 4.57 |
Conv-TasNet | 13.91 | 17.10 | 19.55 | 22.61 | 25.73 | 28.99 | 5.65 | 13.09 | 16.33 | 18.89 | 22.18 | 25.65 | 29.32 | 5.31 |
DCCRN | 13.65 | 17.22 | 20.04 | 23.44 | 26.87 | 30.45 | 6.28 | 12.74 | 16.22 | 18.96 | 22.43 | 25.93 | 29.68 | 5.39 |
DPCRN | 13.33 | 16.78 | 19.45 | 22.76 | 26.01 | 29.11 | 5.57 | 12.92 | 16.32 | 19.02 | 22.40 | 25.80 | 29.25 | 5.35 |
SA-MSTCN\(^{1}\) | 13.34 | 16.68 | 19.33 | 22.64 | 26.01 | 29.51 | 5.58 | 13.45 | 16.74 | 19.24 | 22.69 | 26.32 | 30.08 | 5.82 |
SA-MSTCN\(^{2}\) | 13.66 | 17.05 | 19.63 | 22.91 | 26.26 | 29.62 | 5.85 | 13.45 | 16.74 | 19.47 | 22.87 | 26.46 | 30.15 | 5.92 |
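
The \(\Delta\)Avg. columns appear to report each method's mean score over the six SNR levels minus the corresponding mean for the unprocessed signal. That reading is an assumption inferred from the numbers, not something the table states. The short Python sketch below checks it on a few rows from the No RIR block; the names `delta_avg`, `UNPROCESSED`, and `ROWS` are hypothetical and introduced only for illustration.

```python
from statistics import mean

# Scores copied from the "No RIR" block of the table (-5 dB to 20 dB).
UNPROCESSED = [3.79, 8.84, 12.84, 17.84, 22.84, 27.84]

# method -> (per-SNR scores, Delta-Avg. value as printed in the table)
ROWS = {
    "CRN": ([11.24, 14.77, 17.49, 20.90, 24.24, 27.43], 3.68),
    "Conv-TasNet": ([13.91, 17.10, 19.55, 22.61, 25.73, 28.99], 5.65),
    "DCCRN": ([13.65, 17.22, 20.04, 23.44, 26.87, 30.45], 6.28),
    "SA-MSTCN^2": ([13.66, 17.05, 19.63, 22.91, 26.26, 29.62], 5.85),
}


def delta_avg(scores, baseline=UNPROCESSED):
    """Assumed definition: mean score minus mean unprocessed score."""
    return mean(scores) - mean(baseline)


for name, (scores, reported) in ROWS.items():
    print(f"{name}: computed {delta_avg(scores):.3f}, reported {reported:.2f}")
```

Recomputed values can differ from the printed ones in the last digit, because the per-SNR entries in the table are themselves rounded to two decimals.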