From: Developing a unit selection voice given audio without corresponding text
Posterior probability and duration z-score thresholds

| Percent units used | Test data = Olive; ASR trained on Olive data | Test data = Olive; ASR trained on LibriSpeech | Test data = Intro. to Public Speaking; ASR trained on lecture data | Test data = Intro. to Public Speaking; ASR trained on LibriSpeech |
|---|---|---|---|---|
| 100 | – | – | – | – |
| | 1.00, ≈97 % | 1.00, ≈92 % | 1.00, ≈93 % | 1.00, ≈65 % |
| 50 | 1.00, ± 0.51 | 1.00, ± 0.70 | 1.00, ± 0.57 | 1.00, ± 0.98 |
| 30 | 1.00, ± 0.35 | 1.00, ± 0.45 | 1.00, ± 0.39 | 1.00, ± 0.60 |