Decision tree-based acoustic models for speech recognition

EURASIP Journal on Audio, Speech, and Music Processing

Table 3. WER (%) and number of parameters of triphone forest DTAM systems on the 1992 WSJ non-verbalized 5K closed-test set

System	Number of trees	WER (%)	Number of parameters
Non-forest (MFCC)	1	12.9	766k
Non-forest with gender information (MFCC)	1	11.9	770k
Non-forest (MCMS)	1	13.3	798k
Non-forest (MCMS + MFCC concatenated)	1	12.5	707k
MCMS + MFCC	2	10.7	1500k
Acoustic partitioning (MFCC)	4	10.9	747k
Speaker clustering (MFCC)	4	11.9	806k
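
To make the rows easier to compare, the following minimal sketch computes each system's relative WER reduction from the figures in Table 3. Taking the single-tree "Non-forest (MFCC)" system as the baseline is an assumption made here for illustration, and the helper name relative_wer_reduction is hypothetical; only the WER values themselves come from the table.

```python
# WER (%) values copied from Table 3.
wer = {
    "Non-forest (MFCC)": 12.9,
    "Non-forest with gender information (MFCC)": 11.9,
    "Non-forest (MCMS)": 13.3,
    "Non-forest (MCMS + MFCC concatenated)": 12.5,
    "MCMS + MFCC": 10.7,
    "Acoustic partitioning (MFCC)": 10.9,
    "Speaker clustering (MFCC)": 11.9,
}


def relative_wer_reduction(baseline: float, system: float) -> float:
    """Return the relative WER reduction in percent (positive means better)."""
    return 100.0 * (baseline - system) / baseline


# Assumption: the single-tree MFCC system serves as the reference point.
baseline = wer["Non-forest (MFCC)"]

reductions = {
    name: relative_wer_reduction(baseline, value) for name, value in wer.items()
}

for name, reduction in reductions.items():
    print(f"{name}: {reduction:+.1f}% relative to baseline")
```

Under this choice of baseline, a negative value indicates a system that performs worse than the single-tree MFCC system, as is the case for the MCMS-only row.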