One of the challenges in Chinese word segmentation is combinational ambiguity, which poses two main difficulties: detecting combinationally ambiguous strings and resolving the ambiguity.
Combinational ambiguity has long been a difficult problem in Chinese word segmentation because resolving it depends on contextual information.
Overlapping ambiguity is one of the major types of ambiguity in Chinese word segmentation.
This paper first introduces Chinese text segmentation and then, building on commonly used segmentation algorithms, designs a bidirectional matching segmentation algorithm that effectively reduces the impact of ambiguous words on correct segmentation.
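The sentence above names a bidirectional matching algorithm but gives no details. As a rough illustration only, the following sketch shows one common textbook form of the idea: run forward and backward maximum matching against a dictionary, flag the text as ambiguous when the two results differ, and prefer the result with fewer words and then fewer single-character words. The function names, the toy vocabulary, and the tie-breaking rule are all assumptions made for this example and are not taken from the paper being quoted.

```python
def forward_max_match(text, vocab, max_len):
    """Greedy left-to-right segmentation, longest dictionary word first."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, vocab, max_len):
    """Greedy right-to-left segmentation, longest dictionary word first."""
    tokens = []
    j = len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()
    return tokens


def bidirectional_match(text, vocab):
    """Return (tokens, disagreed).

    disagreed is True when the two passes differ, which signals a
    possible segmentation ambiguity. The preference rule below (fewer
    words, then fewer single-character words, backward result on a tie)
    is an assumed heuristic, not a rule stated in the source.
    """
    max_len = max((len(word) for word in vocab), default=1)
    fwd = forward_max_match(text, vocab, max_len)
    bwd = backward_max_match(text, vocab, max_len)
    if fwd == bwd:
        return fwd, False

    def cost(tokens):
        singles = sum(1 for token in tokens if len(token) == 1)
        return (len(tokens), singles)

    best = fwd if cost(fwd) < cost(bwd) else bwd
    return best, True


# Toy vocabulary, hypothetical and for illustration only.
VOCAB = {"研究", "研究生", "生命", "起源"}
tokens, disagreed = bidirectional_match("研究生命起源", VOCAB)
print(tokens, disagreed)
```

On this toy input the forward pass greedily takes "研究生" and leaves "命" stranded, while the backward pass yields "研究 / 生命 / 起源"; the disagreement is what exposes the overlapping ambiguity, and the single-character penalty selects the backward result.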
The concept of relative word frequency (RWF) is proposed, and on this basis a context calculation model is built that uses the context before and after an ambiguous string to resolve combinational ambiguity in Chinese word segmentation.
In the lexical analysis phase, we did the following work: we added computer-science terminology to a general-purpose segmentation dictionary and eliminated word-class ambiguity;