Targeting at extending the dictionary for word segmentation so as to improve its accuracy, this paper presents a high-frequency Chinese word extraction algorithm based on information entropy.
为扩展分词词典,提高分词的准确率,本文提出了一种基于信息熵的中文高频词抽取算法,其结果可以用来识别未登录词并扩充现有词典。
Targeting at extending the dictionary for word segmentation so as to improve its accuracy, this paper presents a high-frequency Chinese word extraction algorithm based on information entropy.
为扩展分词词典,提高分词的准确率,本文提出了一种基于信息熵的中文高频词抽取算法,其结果可以用来识别未登录词并扩充现有词典。
应用推荐