EURASIP Journal on Audio, Speech, and Music Processing

Table 8 Average STOI (%) scores of compared methods for noisy and enhanced speech under various SNR conditions

From: Supervised Attention Multi-Scale Temporal Convolutional Network for monaural speech enhancement

	No RIR							RIR
SNR	−5 dB	0 dB	5 dB	10 dB	15 dB	20 dB	\(\Delta\)Avg.	−5 dB	0 dB	5 dB	10 dB	15 dB	20 dB	\(\Delta\)Avg.
Unprocessed	83.1	88.4	91.7	94.7	96.8	98.1	-	77.1	83.9	88.2	92.3	95.2	97.1	-
CRN	86.7	91.1	93.5	95.4	96.8	97.6	1.4	82.2	87.7	90.7	93.4	95.3	96.5	2.0
MSTCN	87.0	91.4	93.8	95.9	97.3	98.1	1.8	83.2	88.7	91.7	94.4	96.3	97.6	3.0
LSTM-IRM	89.5	93.2	95.2	96.9	98.0	98.7	3.2	86.0	90.7	93.3	95.6	97.1	98.2	4.5
GCRN	87.7	91.7	93.8	95.6	96.7	97.5	1.7	83.4	88.5	91.3	93.7	95.3	96.4	2.4
GaGNet	89.5	91.6	93.9	95.8	97.1	98.0	2.2	83.3	88.6	91.5	94.2	95.9	97.2	2.8
Conv-TasNet	89.7	93.2	95.0	96.6	97.6	98.4	2.9	85.4	90.1	92.6	95.0	96.6	97.7	3.9
DCCRN	89.3	93.1	95.1	96.8	97.9	98.6	3.0	85.5	90.4	93.0	95.4	96.9	98.0	4.2
DPCRN	89.2	92.9	94.9	96.6	97.7	98.5	2.9	86.1	90.8	93.3	95.6	97.1	98.2	4.5
SA-MSTCN\(^{1}\)	90.6	94.0	95.8	97.2	98.2	98.8	3.7	87.9	92.1	94.3	96.2	97.6	98.4	5.4
SA-MSTCN\(^{2}\)	90.7	94.2	95.9	97.3	98.3	98.8	3.8	87.9	92.1	94.4	96.3	97.6	98.5	5.5
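
The table does not define the \(\Delta\)Avg. column. A plausible reading, stated here as an assumption, is the mean STOI improvement over the unprocessed signal across the six SNR levels. The minimal sketch below checks that reading against the CRN row; the function name `delta_avg` and the dictionary layout are illustrative and not taken from the article.

```python
# Assumption: ΔAvg = mean over the six SNR levels of
# (method STOI - unprocessed STOI). The article does not state this formula.

# STOI (%) values copied from the table, ordered -5, 0, 5, 10, 15, 20 dB.
UNPROCESSED = {
    "No RIR": [83.1, 88.4, 91.7, 94.7, 96.8, 98.1],
    "RIR": [77.1, 83.9, 88.2, 92.3, 95.2, 97.1],
}
CRN = {
    "No RIR": [86.7, 91.1, 93.5, 95.4, 96.8, 97.6],
    "RIR": [82.2, 87.7, 90.7, 93.4, 95.3, 96.5],
}


def delta_avg(method_scores, baseline_scores):
    """Mean per-SNR improvement of a method over the baseline (hypothetical helper)."""
    if len(method_scores) != len(baseline_scores):
        raise ValueError("score lists must cover the same SNR levels")
    diffs = [m - b for m, b in zip(method_scores, baseline_scores)]
    return sum(diffs) / len(diffs)


for condition in UNPROCESSED:
    improvement = delta_avg(CRN[condition], UNPROCESSED[condition])
    print(f"CRN, {condition}: {improvement:.1f}")
```

Under this assumption the CRN row reproduces the tabulated 1.4 (No RIR) and 2.0 (RIR). Recomputing from the rounded table entries can differ from the published column by 0.1 for some rows (LSTM-IRM without RIR gives 3.1 rather than 3.2), presumably because the published values were derived from unrounded scores.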
