exact word extraction dictionary 全字匹配抽取字典
word part extraction dictionary 全字部分匹配抽取字典
exact word part extraction dictionary 全字部分匹配抽取字典
To extend the word segmentation dictionary and improve segmentation accuracy, this paper presents an information-entropy-based algorithm for extracting high-frequency Chinese words; its results can be used to identify out-of-vocabulary words and to expand existing dictionaries.
为扩展分词词典,提高分词的准确率,本文提出了一种基于信息熵的中文高频词抽取算法,其结果可以用来识别未登录词并扩充现有词典。
Word alignment is a basic problem in cross-lingual natural language processing. Many NLP applications built on bilingual corpora, such as SBMT, EBMT, WSD, and automated dictionary extraction, require word-level alignment.
词语对齐是跨语言自然语言处理领域的一个基本问题,许多基于双语语料库的应用(如SBMT、EBMT、WSD、词典编纂)都需要词汇级别的对齐。