Efficient bandwidth extension of musical signals using a differentiable harmonic plus noise model

EURASIP Journal on Audio, Speech, and Music Processing

Table 3 Log-spectral distance (LSD, lower is better) of the evaluated models on monophonic and polyphonic real-world datasets. The best model for each dataset is shown in bold. The last two columns show CPU inference time, expressed as a percentage of real time, and the number of model parameters.

Model	LSD, OrchideaSOL (monophonic)	LSD, Medley-solos-db (monophonic)	LSD, MedleyDB (polyphonic)	LSD, Gtzan (polyphonic)	Inference time (% real-time)	# of parameters
Null	15.9	18.53	24.37	33.84	0	0
SBR [13]	9.27	8.78	11.15	12.96	2	0
ResNet [12]	14.04	15.65	16.17	26.84	48	55M
DDSP-noise	7.20	8.28	8.96	10.06	3	3.5M
DDSP-mono-dec	5.68	8.09	8.98	9.95	9	4.4M
DDSP-mono-dec-cyclic	/	/	11.57	11.60	44	4.4M
DDSP-poly-dec	/	/	9.53	10.31	9	7.5M
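
For readers unfamiliar with the metric reported in Table 3, the sketch below shows one common definition of the log-spectral distance from the bandwidth-extension literature: the root-mean-square difference between log10 power spectra over frequency bins, averaged over time frames. This section does not give the exact variant used for the table (log base, STFT settings, frequency range), so the formula, the function name `log_spectral_distance`, and the `eps` floor are assumptions made for illustration only.

```python
import math


def log_spectral_distance(ref_power, est_power, eps=1e-10):
    """Mean over frames of the RMS difference between log10 power spectra.

    ref_power, est_power: lists of frames, each frame a list of
    non-negative power values |X(t, k)|^2, with identical shapes.
    eps is an assumed small floor that keeps log10 finite for zero bins.
    """
    if not ref_power or len(ref_power) != len(est_power):
        raise ValueError("spectrograms must be non-empty with equal frame counts")

    total = 0.0
    for ref_frame, est_frame in zip(ref_power, est_power):
        if not ref_frame or len(ref_frame) != len(est_frame):
            raise ValueError("frames must be non-empty with equal bin counts")
        squared_sum = 0.0
        for r, e in zip(ref_frame, est_frame):
            diff = math.log10(r + eps) - math.log10(e + eps)
            squared_sum += diff * diff
        # RMS over frequency bins for this frame
        total += math.sqrt(squared_sum / len(ref_frame))

    # Average over time frames
    return total / len(ref_power)


# Toy example: two frames, two bins; one bin is off by a factor of 10 in power.
ref = [[1.0, 10.0], [100.0, 1.0]]
est = [[1.0, 1.0], [100.0, 1.0]]
lsd = log_spectral_distance(ref, est)
lsd_identical = log_spectral_distance(ref, ref)
```

Under this definition, identical spectrograms give a distance of zero, and larger values indicate a poorer match to the reference spectrum.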