提出了一种基于贝叶斯分类与机读词典的多义词排歧方法,通过小规模语料库的训练和歧义词在机读词典中的语义定义来完成歧义的消除。
A word sense disambiguation method for polysemous words, based on Bayesian classification and a machine-readable dictionary, is proposed; it resolves ambiguity by training on a small-scale corpus and using the semantic definitions of the ambiguous words in the machine-readable dictionary.
基于领域词典的文本特征表示方法可以增强文本特征表示能力。
A text feature representation based on a domain dictionary can enhance the expressive power of text features and reduce the feature dimensionality.
首先将互联网上的语料分为混合语料和非平行语料,对于混合语料采用基于启发式规则的方法进行词典抽取,正确率达到了82%;
Corpora from the Internet are first divided into mixed corpora and non-parallel corpora; for the mixed corpora, a heuristic rule-based method is applied to extract the bilingual dictionary, achieving a precision of 82%.
基于语料库的词典编纂技术已经成为现代词典编纂的主流方法。
Corpus-based dictionary compilation has become the mainstream method of modern lexicography.
本文汇总了多种基于语义词典的方法,全面地概括分析了这类方法的特点。
This paper surveys a range of methods based on semantic dictionaries and comprehensively summarizes and analyzes the characteristics of these methods.
实验证明,该方法与基于语言学词典的相似性测度方法相比,更接近用户对文本相似性的判断。表10。图5。参考文献10。
Experiments show that, compared with similarity measures based on linguistic dictionaries, this method is closer to users' judgments of text similarity. 10 tabs. 5 figs. 10 refs.
基于编辑距离和多种后处理的生物医学文献实体名识别方法通过“全称缩写对识别算法”扩充词典,利用编辑距离算法提高识别召回率。
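The method for recognizing entity names in biomedical literature, based on edit distance and multiple post-processing steps, expands the dictionary with a "full name and abbreviation pair recognition algorithm" and uses the edit distance algorithm to improve recognition recall.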