Fig. 6From: Deep learning-based expressive speech synthesis: a systematic review of approaches, challenges, and resourcesBaseline model (1) for unsupervised approaches: Direct reference encoding integrated to an encoder-decoder TTS modelBack to article page