Table 4 STOI values (%) of the proposed model and all baseline models under seen noises. The proposed model (TANSCUNet, the last entry) is set in bold italic letters in the original table

From: Sub-convolutional U-Net with transformer attention network for end-to-end single-channel speech enhancement

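Metric: STOI. Columns are grouped by noise type (Babble, Street, Restaurant); within each group the SNR is −5, 0, and 5 dB, followed by the average (Avg.).

| Model | Babble −5 | Babble 0 | Babble 5 | Babble Avg. | Street −5 | Street 0 | Street 5 | Street Avg. | Restaurant −5 | Restaurant 0 | Restaurant 5 | Restaurant Avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Noisy mixture | 56.75 | 62.04 | 68.55 | 62.45 | 55.41 | 61.26 | 69.17 | 61.95 | 58.57 | 66.47 | 73.15 | 66.06 |
| Bi-LSTM [31] | 67.79 | 74.22 | 78.57 | 73.53 | 67.26 | 71.35 | 77.14 | 71.92 | 68.26 | 76.43 | 80.76 | 75.15 |
| Bi-CRN [34] | 68.51 | 75.35 | 80.87 | 74.91 | 68.68 | 73.97 | 79.54 | 74.06 | 70.26 | 78.43 | 82.23 | 76.97 |
| SEGAN [40] | 69.93 | 76.69 | 81.69 | 76.10 | 69.59 | 75.03 | 81.19 | 75.27 | 72.26 | 79.13 | 83.08 | 78.16 |
| GRN [30] | 70.12 | 78.94 | 82.05 | 77.04 | 70.34 | 76.36 | 82.88 | 76.53 | 74.96 | 80.05 | 84.81 | 79.94 |
| DCN [38] | 72.09 | 80.11 | 83.77 | 78.66 | 71.26 | 78.91 | 83.68 | 77.95 | 75.19 | 81.74 | 85.64 | 80.86 |
| DCCRN [35] | 74.13 | 81.54 | 84.98 | 80.22 | 73.56 | 79.34 | 84.06 | 78.99 | 76.46 | 82.16 | 86.97 | 81.86 |
| TSTNN [41] | 75.41 | 83.16 | 86.53 | 81.70 | 74.59 | 81.23 | 85.14 | 80.32 | 77.39 | 83.69 | 87.56 | 82.88 |
| MASENet [46] | 77.32 | 84.04 | 87.15 | 82.84 | 76.68 | 82.24 | 86.85 | 81.92 | 78.18 | 84.47 | 88.18 | 83.61 |
| SADNUNet [47] | 78.51 | 86.43 | 88.11 | 84.35 | 77.61 | 84.41 | 87.56 | 83.19 | 79.52 | 86.63 | 90.21 | 85.45 |
| MCGN [42] | 80.31 | 87.78 | 90.03 | 86.04 | 78.54 | 85.69 | 88.94 | 84.39 | 80.13 | 88.33 | 91.83 | 86.76 |
| DBT-Net [51] | 80.92 | 88.03 | 91.12 | 86.69 | 79.64 | 86.71 | 89.40 | 85.25 | 81.22 | 89.24 | 92.47 | 87.64 |
| TANSCUNet | 82.62 | 89.71 | 92.72 | 88.35 | 81.69 | 88.12 | 91.94 | 87.25 | 82.56 | 90.81 | 93.86 | 89.08 |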
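The Avg. entries appear to be the arithmetic mean of the three SNR conditions (−5, 0, and 5 dB) for each noise type. The short Python sketch below checks that reading on two rows copied from the table and computes the gain in average STOI over the noisy mixture. The names `TABLE`, `avg_matches`, and `avg_gain` are hypothetical helpers introduced here for illustration; they are not part of the source paper, and the mean-of-three interpretation is an assumption that the table itself does not state.

```python
# STOI scores (%) copied from Table 4 for two rows.
# Each tuple is (-5 dB, 0 dB, 5 dB, reported Avg.).
TABLE = {
    "Noisy mixture": {
        "Babble": (56.75, 62.04, 68.55, 62.45),
        "Street": (55.41, 61.26, 69.17, 61.95),
        "Restaurant": (58.57, 66.47, 73.15, 66.06),
    },
    "TANSCUNet": {
        "Babble": (82.62, 89.71, 92.72, 88.35),
        "Street": (81.69, 88.12, 91.94, 87.25),
        "Restaurant": (82.56, 90.81, 93.86, 89.08),
    },
}


def avg_matches(row, tol=0.006):
    """Return True if the reported Avg. equals the mean of the three
    per-SNR scores, up to two-decimal rounding (assumed convention)."""
    *per_snr, reported = row
    return abs(sum(per_snr) / len(per_snr) - reported) <= tol


def avg_gain(model, baseline="Noisy mixture"):
    """Difference in reported Avg. STOI between `model` and `baseline`,
    per noise type, in percentage points."""
    return {
        noise: round(TABLE[model][noise][3] - TABLE[baseline][noise][3], 2)
        for noise in TABLE[model]
    }


if __name__ == "__main__":
    for model, rows in TABLE.items():
        for noise, row in rows.items():
            print(model, noise, "Avg. consistent:", avg_matches(row))
    print("Gain over noisy mixture:", avg_gain("TANSCUNet"))
```

The same check can be extended to the remaining rows by adding them to `TABLE` in the same (−5 dB, 0 dB, 5 dB, Avg.) order.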