Fig. 5
From: Deep learning-based expressive speech synthesis: a systematic review of approaches, challenges, and resources

Labels represented as multiple separate layers: the shared layers are trained with data from all emotions, while the emotion-specific layers are trained only with data of the corresponding emotion.
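
The training scheme described in the caption can be illustrated with a minimal sketch. It is not the architecture of any model covered by the review: the scalar "layers", the label set `EMOTIONS`, the function `train`, and the toy samples are all hypothetical and chosen only to show how updates are routed. Every sample updates the shared weight, while only the head matching the sample's emotion label is updated.

```python
from collections import Counter

EMOTIONS = ["neutral", "happy", "sad"]  # hypothetical label set

def train(samples, epochs=1, lr=0.01):
    """Toy SGD loop; samples is a list of (x, y, emotion) with scalar x and y."""
    shared_w = 0.5                           # stands in for the shared layers
    head_w = {e: 1.0 for e in EMOTIONS}      # one emotion-specific "layer" per label
    updates = Counter()                      # how often each part was updated

    for _ in range(epochs):
        for x, y, emotion in samples:
            hidden = shared_w * x            # shared forward pass
            pred = head_w[emotion] * hidden  # only the matching head is used
            err = pred - y                   # gradient of 0.5 * err**2 w.r.t. pred

            grad_head = err * hidden
            grad_shared = err * head_w[emotion] * x

            head_w[emotion] -= lr * grad_head  # updated by this emotion's data only
            shared_w -= lr * grad_shared       # updated by data from all emotions
            updates[emotion] += 1
            updates["shared"] += 1
    return shared_w, head_w, updates

# Toy data (assumed for illustration): two "happy" samples, one "sad", no "neutral".
samples = [(1.0, 2.0, "happy"), (2.0, 4.0, "happy"), (1.0, 0.5, "sad")]
shared_w, head_w, updates = train(samples)
```

In this sketch the shared weight receives three updates, the "happy" head two, the "sad" head one, and the "neutral" head keeps its initial value because no neutral sample was seen. In a real system the scalars would be replaced by network layers, but the routing of gradients would follow the same pattern.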