Images, however, are not easily rendered by text-to-speech devices for people with visual impairments.
This paper describes a Chinese text-to-visual speech synthesis system based on a data-driven (sample-based) approach, in which new visual speech is generated by concatenating short video segments.
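The sample-based concatenation idea can be illustrated with a minimal sketch. All names here (`SEGMENT_CORPUS`, `synthesize_visual_speech`, the frame labels) are hypothetical placeholders, not the paper's actual implementation: a real system would select and smooth video frames, while this toy version simply looks up a prerecorded clip per syllable and joins the clips in order.

```python
# Toy corpus: each Mandarin syllable maps to the frames of a short
# prerecorded video clip of a speaker's mouth (frame labels stand in
# for real image data). Hypothetical data for illustration only.
SEGMENT_CORPUS = {
    "ni": ["ni_f0", "ni_f1", "ni_f2"],
    "hao": ["hao_f0", "hao_f1"],
}

def synthesize_visual_speech(syllables):
    """Concatenate stored video segments for each input syllable."""
    frames = []
    for s in syllables:
        if s not in SEGMENT_CORPUS:
            raise KeyError(f"no video sample recorded for syllable {s!r}")
        frames.extend(SEGMENT_CORPUS[s])
    return frames

print(synthesize_visual_speech(["ni", "hao"]))
# → ['ni_f0', 'ni_f1', 'ni_f2', 'hao_f0', 'hao_f1']
```

In practice the concatenation boundaries would need blending or smoothing to avoid visible jumps between segments; the sketch only shows the lookup-and-join structure of the sample-based approach.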