首先,介绍了无监督词义消歧研究的意义。
First, the significance of research on unsupervised word sense disambiguation is introduced.
词义消歧问题可以形式化为典型的分类问题。
The word sense disambiguation problem can be formalized as a typical classification problem.
提出了一种基于语义网络结构的词义消歧方法。
A word sense disambiguation method based on semantic network structure is proposed.
词义消歧是自然语言处理中的一个难点和热点问题。
Word sense disambiguation (WSD) has long been a difficult and actively studied problem in natural language processing.
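神经网络的结构相对复杂,用于词义消歧需要先解决输入问题。
Neural networks have a relatively complex structure; applying them to word sense disambiguation first requires solving the input problem.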
实验数据表明贝叶斯网络比神经网络更适合解决汉语词义消歧问题。
The experimental data show that Bayesian networks are better suited than artificial neural networks (ANN) to Chinese WSD.
本文通过实验考察了这两种网络模型在汉语词义消歧上的应用效果。
This paper experimentally investigates how well these two network models perform on Chinese word sense disambiguation (WSD).
通常把词义消歧作为模式分类问题进行研究,其中特征选择是一个重要的环节。
WSD is usually studied as a pattern classification problem, in which feature selection is an important step.
初步的实验结果表明,该方法可以有效地进行汉语名词、动词、形容词的词义消歧。
The results of this study indicate that the SKCC is effective for word sense disambiguation in MT systems and is likely to be important for general Chinese NLP.
基于NBM的无指导词义消歧正确率略低,但有很好的扩展性,值得进一步的研究。
Unsupervised WSD based on NBM achieves slightly lower accuracy than the supervised approach, but it extends well and is worth further study.
研究的目的是对现有的无监督词义消歧技术进行总结,以期为进一步的研究指明方向。
The goal of this paper is to give a brief summary of the current unsupervised word sense disambiguation techniques in order to facilitate future research.
使用伪词可以避免有指导的词义消歧方法中的数据稀疏问题,充分验证词义分类器的实验效果。
Using pseudowords, we can avoid the data sparseness problem in supervised WSD and thoroughly verify the experimental performance of the word sense classifier.
目前进行的很多词义消歧研究多采用几个多义词作为试验测试对象,在实际应用方面存在着局限性。
Many current word sense disambiguation studies use only a few ambiguous words as test objects, which limits their practical applicability.
采用基于依存分析改进贝叶斯网络的无指导的机器学习方法对汉语大规模真实文本进行词义消歧实验。
Word sense disambiguation (WSD) experiments on large-scale real-world Chinese text are carried out using an unsupervised machine learning method based on a Bayesian network improved with dependency analysis (DGA).
双语语料库在基于实例的机器翻译、翻译知识的获取、双语词典的建立、词义消歧等领域有着重要的应用价值。
Bilingual corpora have important applications in example-based machine translation (EBMT), translation knowledge acquisition, bilingual dictionary construction, word sense disambiguation, and related areas.
前者涉及到词法、句法、语义分析,包括汉语分词、词性标注、注音、命名实体识别、新词发现、句法分析、词义消歧等。
The former involves lexical, syntactic, and semantic analysis, including Chinese word segmentation, part-of-speech tagging, pinyin annotation, named entity recognition, new word detection, syntactic parsing, word sense disambiguation, etc.
译文消歧及与之相似的在单语范畴内的词义消歧一直是自然语言处理领域基础研究课题,它也是自然语言处理技术的重点和难点之一。
Word translation disambiguation (WTD) and the similar monolingual task of word sense disambiguation (WSD) have long been fundamental research topics in natural language processing (NLP), and they are among its key and most difficult problems.
基于相对词频,提出语境计算模型,用于对汉语文本词义进行消歧。
Based on relative word frequency (RWF), a context calculation model is proposed to disambiguate word senses in Chinese text.
其优点在于两个方面:1不受词义标注语料库规模的影响;2对特定词语意义的消歧准确率可达到100%。
The striking advantages of the feature-based approach are twofold: (1) it is not affected by the size of the sense-tagged corpus, and (2) it can disambiguate certain specific words with 100% precision.