Table 9 Denoising performance on test dataset without reverberation

From: Channel and temporal-frequency attention UNet for monaural speech enhancement

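| Metric | Method | SNR −5 dB | SNR 0 dB | SNR 5 dB | SNR 10 dB | SNR 15 dB | Avg. |
|---|---|---|---|---|---|---|---|
| WB-PESQ | Noisy | 1.257 | 1.379 | 1.593 | 1.901 | 2.323 | 1.691 |
| WB-PESQ | Uformer | 1.588 | 1.761 | 1.192 | 2.008 | 2.060 | 1.723 |
| WB-PESQ | MTFAA | 1.575 | 1.831 | 2.153 | 2.485 | 2.822 | 2.173 |
| WB-PESQ | CTFUNet | 2.005 | 2.341 | 2.711 | 3.052 | 3.358 | 2.693 |
| DNSMOS | Noisy | 1.809 | 2.135 | 2.491 | 2.758 | 2.936 | 2.423 |
| DNSMOS | Uformer | 2.711 | 2.876 | 2.994 | 3.047 | 3.054 | 2.936 |
| DNSMOS | MTFAA | 2.695 | 2.920 | 3.101 | 3.228 | 3.310 | 3.051 |
| DNSMOS | CTFUNet | 3.000 | 3.145 | 3.252 | 3.323 | 3.368 | 3.218 |
| STOI (%) | Noisy | 77.117 | 84.300 | 90.089 | 94.036 | 96.674 | 88.443 |
| STOI (%) | Uformer | 82.139 | 86.073 | 88.034 | 86.620 | 88.257 | 86.625 |
| STOI (%) | MTFAA | 83.204 | 89.440 | 93.281 | 95.608 | 96.970 | 91.701 |
| STOI (%) | CTFUNet | 88.186 | 92.517 | 95.329 | 96.987 | 98.006 | 94.205 |
| SI-SDR (dB) | Noisy | 0.925 | 5.903 | 10.953 | 15.860 | 20.872 | 10.903 |
| SI-SDR (dB) | Uformer | 7.615 | 9.481 | 10.576 | 10.900 | 10.624 | 9.8392 |
| SI-SDR (dB) | MTFAA | 6.245 | 9.805 | 12.674 | 14.843 | 16.393 | 11.992 |
| SI-SDR (dB) | CTFUNet | 10.288 | 13.331 | 16.309 | 19.001 | 21.390 | 16.064 |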