组合型歧义切分字段一直是汉语自动分词的难点,难点在于消歧依赖其上下文语境信息。
Combinational ambiguity has long been a difficult problem in automatic Chinese word segmentation, because resolving it depends on contextual information.
汉语不同于英语,词之间没有间隔标记。而汉语分词是文本分析的第一步,且存在歧义切分,因此分词问题成为汉语分析的首要难题。
Unlike English, Chinese has no delimiters between words. Word segmentation is the first step of text analysis and is subject to segmentation ambiguity, so it becomes the foremost difficulty in Chinese language analysis.
切分歧义是影响汉语自动分词系统精度的一个重要因素。
Segmentation ambiguity is an important factor affecting the accuracy of automatic Chinese word segmentation systems.
该文利用一种统计的方法来解决交集型歧义字段的切分。
This paper uses a statistical method to resolve the segmentation of overlapping ambiguous fields.
歧义处理是影响分词系统切分精度的重要因素,是自动分词系统设计中的一个最困难也是最核心的问题。
Ambiguity resolution is an important factor affecting the segmentation precision of a word segmentation system, and it is one of the most difficult and most central problems in the design of automatic word segmentation systems.
系统包括初切分,词性标注、歧义字段处理、模型平滑、未登录词识别等功能模块。
The system includes functional modules for initial segmentation, POS tagging, ambiguous-field processing, model smoothing, and unknown-word recognition.
对汉语进行切分和标注,不可避免要产生歧义。
Ambiguities inevitably arise when Chinese text is segmented and tagged.
本文在利用正向最大匹配方法和逆向最大匹配方法来对输入文本进行预切分,并通过双向扫描的方法检测歧义字段。
This paper pre-segments the input text using forward maximum matching and reverse maximum matching, and detects ambiguous fields by scanning in both directions.
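The bidirectional scan described above can be illustrated with a minimal sketch. This is not the cited paper's implementation: the toy dictionary, the function names (`forward_max_match`, `reverse_max_match`, `ambiguous_fields`), and the single-character fallback for unmatched text are all assumptions made for illustration.

```python
# Minimal sketch of ambiguity detection by bidirectional maximum matching.
# The dictionary and all function names are hypothetical, for illustration only.

def forward_max_match(text, dictionary, max_len):
    """Greedy left-to-right segmentation, longest dictionary word first."""
    words, pos = [], 0
    while pos < len(text):
        for length in range(min(max_len, len(text) - pos), 0, -1):
            candidate = text[pos:pos + length]
            # Fall back to a single character when nothing matches.
            if length == 1 or candidate in dictionary:
                words.append(candidate)
                pos += length
                break
    return words

def reverse_max_match(text, dictionary, max_len):
    """Greedy right-to-left segmentation, longest dictionary word first."""
    words, end = [], len(text)
    while end > 0:
        for length in range(min(max_len, end), 0, -1):
            candidate = text[end - length:end]
            if length == 1 or candidate in dictionary:
                words.append(candidate)
                end -= length
                break
    return words[::-1]

def spans(words):
    """Convert a word list into (start, end) character offsets."""
    result, start = [], 0
    for word in words:
        result.append((start, start + len(word)))
        start += len(word)
    return result

def ambiguous_fields(text, fmm, rmm):
    """Return substrings where the two segmentations disagree."""
    f_spans, r_spans = set(spans(fmm)), set(spans(rmm))
    f_bounds = {0} | {end for _, end in f_spans}
    r_bounds = {0} | {end for _, end in r_spans}
    common = sorted(f_bounds & r_bounds)
    fields = []
    for a, b in zip(common, common[1:]):
        # Between two shared boundaries the results agree only if
        # both treat text[a:b] as one word.
        if not ((a, b) in f_spans and (a, b) in r_spans):
            fields.append(text[a:b])
    return fields

toy_dictionary = {"研究", "研究生", "生命", "命", "的", "起源"}
sentence = "研究生命的起源"
max_word_len = max(len(w) for w in toy_dictionary)

fmm_result = forward_max_match(sentence, toy_dictionary, max_word_len)
rmm_result = reverse_max_match(sentence, toy_dictionary, max_word_len)
detected = ambiguous_fields(sentence, fmm_result, rmm_result)
print(fmm_result, rmm_result, detected)
```

With this toy dictionary the forward pass picks 研究生 first while the reverse pass picks 生命, so the disagreement isolates 研究生命 as the ambiguous field. Comparing the two passes only detects fields on which they differ; ambiguities where both passes make the same choice go unnoticed.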
目前学术界主要采用计算机自动分词解决中文文本分词,但是这种方法不能完全解决分词问题,这是因为计算机自动分词不能彻底地解决歧义字段的切分。
At present, researchers mainly rely on automatic word segmentation by computer to segment Chinese text, but this approach cannot fully solve the segmentation problem, because automatic segmentation cannot completely resolve ambiguous fields.
实验1请被试比较歧义句中切分出来的所指名词与非所指名词的重读程度。
In Experiment 1, subjects were asked to compare the degree of stress on the referent noun and the non-referent noun segmented out of the ambiguous sentences.
自动切分过程中会出现许多歧义,例如下图中只有红色标记的切分结果是正确的。
Many ambiguities arise during automatic segmentation. For instance, in the figure below, only the segmentation marked in red is correct.