From: W2VC: WavLM representation based one-shot voice conversion with gradient reversal distillation and CTC supervision
| Method | WER (\(\downarrow\)) | CER (\(\downarrow\)) |
| --- | --- | --- |
| VQ-VAE | 20.21 | 10.30 |
| CTC-VQ-VAE | 2.99 | 1.08 |
| FragmentVC | 72.85 | 46.10 |
| W2VC | 1.63 | 0.57 |
| Ground truth | 1.30 | 0.48 |
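
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how word error rate (WER) and character error rate (CER) are commonly defined: the Levenshtein edit distance between a reference transcript and a recognized transcript, divided by the reference length. The function names `edit_distance`, `wer`, and `cer` are illustrative, and scaling to a percentage is an assumption about the units of the table. The paper's actual evaluation pipeline (ASR model, text normalization) is not described in this section and may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of substitutions, insertions, and
    deletions needed to turn `ref` into `hyp`.
    """
    prev = list(range(len(hyp) + 1))  # distances for the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent (percent scaling is an assumption)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent; spaces count as characters here."""
    if not reference:
        raise ValueError("reference must contain at least one character")
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    print(wer("the cat sat", "the bat sat"))
    print(cer("the cat sat", "the bat sat"))
```

Lower values are better for both metrics, which is what the downward arrows in the table header indicate. The "Ground truth" row gives the error measured on real recordings, so it serves as a practical lower bound for the converted speech.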