This paper presents a fast algorithm for automatic Chinese word segmentation.
We also describe a concrete implementation of the new segmentation algorithm.
A two-pass segmentation algorithm based on paragraph titles and first sentences is proposed.
Experiments show that the segmentation algorithm achieves good speed and accuracy when segmenting professional literature.
This paper proposes a segmentation algorithm based on the conventional shortest-path method: the layered shortest-path segmentation method.
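The layered refinement is not detailed in this excerpt, but a minimal sketch of the conventional shortest-path segmentation it builds on (a toy dictionary is assumed; the path through the word graph with the fewest words wins) might look like this:

def shortest_path_segment(sentence, dictionary):
    n = len(sentence)
    # best[i] = (word_count, segmentation) for the prefix sentence[:i]
    best = [(0, [])] + [(float("inf"), None)] * n
    for i in range(n):
        if best[i][1] is None:
            continue
        for j in range(i + 1, n + 1):
            piece = sentence[i:j]
            # An edge exists for every dictionary word; single characters
            # are always allowed as a fallback.
            if piece in dictionary or j == i + 1:
                cand = (best[i][0] + 1, best[i][1] + [piece])
                if cand[0] < best[j][0]:
                    best[j] = cand
    return best[n][1]

dictionary = {"中文", "分词", "算法", "中文分词"}
print(shortest_path_segment("中文分词算法", dictionary))
# ['中文分词', '算法'] -- the fewest-words path wins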
Full-segmentation algorithms are also studied, and a parallel-search model for full segmentation is given.
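The parallel-search model itself is not reproduced in this excerpt; the sequential full-segmentation enumeration that it would parallelize can be sketched as follows (toy dictionary assumed):

def omni_segment(sentence, dictionary):
    # Recursively enumerate every split whose pieces are all dictionary
    # words (single characters allowed as a fallback).
    if not sentence:
        return [[]]
    results = []
    for j in range(1, len(sentence) + 1):
        piece = sentence[:j]
        if piece in dictionary or j == 1:
            for rest in omni_segment(sentence[j:], dictionary):
                results.append([piece] + rest)
    return results

dictionary = {"中文", "分词", "中文分词"}
for seg in omni_segment("中文分词", dictionary):
    print(seg)
# ['中', '文', '分', '词'], ['中', '文', '分词'], ['中文', '分', '词'],
# ['中文', '分词'], ['中文分词']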
Experimental results show that, under the same conditions, the improved maximum matching algorithm based on a two-character-word detection bitmap segments faster than the original algorithm.
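The cited bitmap records which character pairs begin (or form) multi-character dictionary words, so the matcher can skip dictionary probes at positions where no such word can start. A minimal sketch, with a Python set standing in for the bitmap:

def build_bigram_filter(dictionary):
    # Leading character pair of every multi-character word; the paper
    # uses a bitmap, a set stands in for it here.
    return {word[:2] for word in dictionary if len(word) >= 2}

def max_match(sentence, dictionary, bigram_filter, max_len=6):
    out, i = [], 0
    while i < len(sentence):
        word = sentence[i]  # single-character fallback
        if sentence[i:i + 2] in bigram_filter:  # cheap prefilter
            for j in range(min(len(sentence), i + max_len), i + 1, -1):
                if sentence[i:j] in dictionary:
                    word = sentence[i:j]
                    break
        out.append(word)
        i += len(word)
    return out

dictionary = {"最大匹配", "分词", "算法"}
print(max_match("最大匹配分词算法", dictionary, build_bigram_filter(dictionary)))
# ['最大匹配', '分词', '算法']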
Vocabulary processing techniques for Chinese text mining are discussed in depth, and a dictionary-free segmentation algorithm tailored to the characteristics of Chinese is proposed.
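The paper's dictionary-free algorithm is not specified in this abstract; one common dictionary-free approach cuts between characters whose pointwise mutual information in a corpus is low. A rough, illustrative sketch (the statistic and threshold are assumptions, not the paper's):

import math
from collections import Counter

def mi_segment(text, corpus, threshold=0.0):
    # Cut between two characters when their pointwise mutual information
    # in the corpus falls below a threshold; no dictionary is consulted.
    chars = Counter(corpus)
    bigrams = Counter(corpus[i:i + 2] for i in range(len(corpus) - 1))
    total = len(corpus)
    def pmi(a, b):
        if not (chars[a] and chars[b] and bigrams[a + b]):
            return float("-inf")  # unseen pair: assume a boundary
        pab = bigrams[a + b] / (total - 1)
        return math.log(pab / (chars[a] / total * (chars[b] / total)))
    words, word = [], text[0]
    for a, b in zip(text, text[1:]):
        if pmi(a, b) >= threshold:
            word += b           # strongly associated: keep together
        else:
            words.append(word)  # weak association: word boundary
            word = b
    words.append(word)
    return words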
This paper first constructs the architecture of an automatic question-answering system, improves the Chinese word segmentation algorithm, and then designs the system using a domain ontology base and sentence similarity.
This paper improves the traditional reverse-order segmentation dictionary by designing a root-hash table for it and gives the corresponding segmentation algorithm; experiments show that the improvement is effective.
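One plausible reading of the root hash is an index keyed on a word's final character, so that backward matching can jump straight to the candidates ending at the current position. A minimal sketch with an assumed toy dictionary:

from collections import defaultdict

def build_reverse_index(dictionary):
    # Hash table keyed on a word's final character, standing in for the
    # paper's root hash over a reverse-order dictionary.
    index = defaultdict(set)
    for word in dictionary:
        index[word[-1]].add(word)
    return index

def backward_max_match(sentence, reverse_index, max_len=6):
    out, j = [], len(sentence)
    while j > 0:
        word = sentence[j - 1]  # single-character fallback
        for i in range(max(0, j - max_len), j - 1):  # longest first
            if sentence[i:j] in reverse_index[sentence[j - 1]]:
                word = sentence[i:j]
                break
        out.append(word)
        j -= len(word)
    return list(reversed(out))

index = build_reverse_index({"研究", "研究生", "生命", "起源"})
print(backward_max_match("研究生命起源", index))
# ['研究', '生命', '起源'] -- backward matching avoids the forward
# mis-split ['研究生', '命', '起源']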
This paper first introduces Chinese word segmentation and then, building on common segmentation algorithms, designs a bidirectional matching algorithm that effectively reduces the impact of ambiguous words on correct segmentation.
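Bidirectional matching runs forward and backward maximum matching and keeps the less ambiguous result; the tie-breaking rules below (fewer words, then fewer single-character tokens) are a common heuristic, not necessarily the paper's:

def _match(sentence, dictionary, max_len=6, forward=True):
    # Greedy maximum matching in either direction (toy helper).
    out, text, i = [], sentence if forward else sentence[::-1], 0
    while i < len(text):
        word = text[i]  # single-character fallback
        for j in range(min(len(text), i + max_len), i + 1, -1):
            cand = text[i:j] if forward else text[i:j][::-1]
            if cand in dictionary:
                word = cand
                break
        out.append(word)
        i += len(word)
    return out if forward else out[::-1]

def bidirectional_segment(sentence, dictionary):
    fwd = _match(sentence, dictionary, forward=True)
    bwd = _match(sentence, dictionary, forward=False)
    # Prefer the split with fewer words, then fewer single characters.
    key = lambda seg: (len(seg), sum(len(w) == 1 for w in seg))
    return min([fwd, bwd], key=key)

dictionary = {"研究", "研究生", "生命", "起源"}
print(bidirectional_segment("研究生命起源", dictionary))
# ['研究', '生命', '起源']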
The major work includes: (1) an improved Chinese word segmentation algorithm for large-scale Chinese information processing, a basic step in building a Chinese search engine.
The main part of the paper studies algorithms for resolving ambiguity in word segmentation.
Based on the cross-covering algorithm for feedforward neural networks, this paper implements automatic text classification after word segmentation preprocessing.
This paper systematically introduces Chinese word segmentation techniques and pattern matching theory, and designs, based on their algorithms, a workflow for matching and integrating address data.
A fast algorithm for generating the directed graph used in Chinese word segmentation is given.
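The digraph has a vertex at every gap between characters and an edge for every dictionary word (plus single-character fallback edges); a straightforward generator, with an assumed toy dictionary, looks like this:

def build_segmentation_digraph(sentence, dictionary, max_len=6):
    # An edge (i, j) means sentence[i:j] is a dictionary word; single
    # characters are always allowed.
    n = len(sentence)
    graph = {i: [] for i in range(n)}
    for i in range(n):
        graph[i].append(i + 1)  # single-character edge
        for j in range(i + 2, min(n, i + max_len) + 1):
            if sentence[i:j] in dictionary:
                graph[i].append(j)
    return graph

dictionary = {"中文", "分词", "中文分词"}
print(build_segmentation_digraph("中文分词", dictionary))
# {0: [1, 2, 4], 1: [2], 2: [3, 4], 3: [4]}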
To extend the segmentation dictionary and improve segmentation accuracy, this paper proposes an information-entropy-based algorithm for extracting high-frequency Chinese words; its results can be used to identify out-of-vocabulary words and enlarge the existing dictionary.
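A typical entropy-based extractor scores frequent character n-grams by the entropy of their neighbouring characters: many distinct continuations suggest a free-standing word. A rough sketch with illustrative thresholds (the paper's exact statistic is not given here):

import math
from collections import Counter, defaultdict

def candidate_words(corpus, n=2, min_freq=2, min_entropy=0.5):
    # Keep frequent n-grams whose right-neighbour distribution has high
    # entropy; thresholds are illustrative, not the paper's.
    freq = Counter(corpus[i:i + n] for i in range(len(corpus) - n + 1))
    right = defaultdict(Counter)
    for i in range(len(corpus) - n):
        right[corpus[i:i + n]][corpus[i + n]] += 1
    def entropy(counter):
        total = sum(counter.values())
        return -sum(c / total * math.log(c / total) for c in counter.values())
    return [g for g, f in freq.items()
            if f >= min_freq and entropy(right[g]) >= min_entropy]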
Several existing Chinese word segmentation methods are analyzed, and a keyword extraction algorithm based on a weighting formula is proposed.
Based on a prefix tree and dynamic programming, the algorithm speeds up Chinese word segmentation while maintaining relatively high accuracy.
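A prefix tree lets one scan from each starting position collect all dictionary matches, and dynamic programming then picks an optimal split (here, fewest words, as one plausible criterion):

def build_trie(dictionary):
    root = {}
    for word in dictionary:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = True  # end-of-word marker
    return root

def trie_dp_segment(sentence, trie):
    n = len(sentence)
    best = [0] + [float("inf")] * n   # fewest words covering sentence[:i]
    back = [0] * (n + 1)              # start index of the last word
    for i in range(n):
        node, j = trie, i
        matches = [i + 1]  # single-character fallback
        while j < n and sentence[j] in node:  # one trie walk finds all words
            node = node[sentence[j]]
            j += 1
            if "$" in node:
                matches.append(j)
        for m in matches:
            if best[i] + 1 < best[m]:
                best[m], back[m] = best[i] + 1, i
    words, j = [], n
    while j > 0:
        words.append(sentence[back[j]:j])
        j = back[j]
    return list(reversed(words))

trie = build_trie({"中文", "分词", "中文分词", "算法"})
print(trie_dp_segment("中文分词算法", trie))
# ['中文分词', '算法']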
During segmentation, the system uses an improved forward maximum matching algorithm, which raises segmentation efficiency.
In the index module, the design of Chinese word segmentation is discussed first and a segmentation algorithm is selected.
Initially, it was a Chinese word segmentation component built on the open-source project Lucene, combining dictionary-based segmentation with grammar analysis algorithms.
The algorithm eliminates a large number of segmentation ambiguities and achieves good segmentation results.