本文研究并实现了基于机器学习的分词系统。
This paper studies and implements a word segmentation system based on machine learning.
摘要词典是许多中文分词系统的一个重要的组成部分。
Abstract: The dictionary is an important component of many Chinese word segmentation systems.
切分歧义是影响汉语自动分词系统精度的一个重要因素。
Segmentation ambiguity is an important factor affecting the accuracy of Chinese automatic word segmentation systems.
提出了一种基于语词的分词系统,设计了相应的分词词典。
A word segmentation system based on words and expressions is proposed, and a corresponding segmentation dictionary is designed.
论文介绍了一个基于词频统计的中文分词系统的设计和实现。
This paper introduces the design and implementation of a Chinese word segmentation system based on word-frequency statistics.
论文的核心工作是设计并实现了一个基于多步处理策略的汉语自动分词系统。
The core work of the paper is designing and implementing a Chinese auto-segmentation system based on a multi-step processing strategy.
本文介绍了一个已研制成功的新闻语料自动分词系统—NEWS的结构和功能。
This paper introduces the structure and functions of NEWS, a successfully developed automatic word segmentation system for news corpora.
为扩展分词知识库,提高自动分词能力,本文提出了一种基于自学习机制的汉语自动分词系统。
To extend the word segmentation knowledge base and improve automatic segmentation capability, this paper proposes a Chinese automatic word segmentation system based on a self-learning mechanism.
基于理解的分词方法研究尚未成熟,所以,绝大部分中文分词系统是应用机械统计相结合的方法。
Because research on understanding-based segmentation methods is not yet mature, the vast majority of Chinese word segmentation systems combine mechanical (dictionary-based) and statistical methods.
文本分类有助于用户有选择地阅读和处理海量文本,因此其预备工作分词系统的研究是很有意义的。
Text classification helps users selectively read and process massive amounts of text, so research on word segmentation systems, its preparatory step, is highly worthwhile.
歧义处理是影响分词系统切分精度的重要因素,是自动分词系统设计中的一个最困难也是最核心的问题。
Ambiguity resolution is an important factor affecting the segmentation accuracy of a word segmentation system, and it is one of the most difficult and most central problems in the design of automatic word segmentation systems.
根据以上分析,我们提出了一种基于记忆的处理策略,可有效改善实用型非受限汉语自动分词系统的精度。
As a consequence, we propose a memory-based strategy that is expected to improve the performance of practical Chinese word segmenters significantly.
通过与分词系统实验结果相比,验证了该方法的有效性。 (2)多策略的领域概念上下位关系学习方法。
Comparison with the experimental results of a word segmentation system verifies the effectiveness of the method. (2) A multi-strategy method for learning hypernym-hyponym relations among domain concepts.
目前,研究人员猜测梦是大脑情感自动调节系统的组成部分,当大脑处于“掉线”状态时对情绪进行调整。 【析句】Suspect后面是that引导的宾语从句;逗号之后的现在分词短语regulating moods…作定语,解释thermostat 的意思,句末的while引导一个时间状语从句。
Now researchers suspect that dreams are part of the mind's emotional thermostat, regulating moods while the brain is “off-line”.
实验结果显示选取适当的特征数目、使用好的分词技术、使用命名实体识别技术都能改进中文话题追踪系统的性能。
The experimental results show that selecting an appropriate number of features, using a good word segmentation technique, and using named entity recognition can all improve the performance of a Chinese topic tracking system.
对于海量信息处理的应用,分词的速度是极为重要的,对整个系统的效率有很大的影响。
For applications that process massive amounts of information, word segmentation speed is extremely important and has a large effect on the efficiency of the whole system.
设计和实现了汉语数据库自然语言查询接口系统(IDCQ),系统包括正则分词子系统和对象语义解析子系统;
The Interface for Database Query in Chinese (IDCQ), a natural language query interface system for databases, is designed and implemented; the system includes a regular word segmentation subsystem and an object semantic parsing subsystem;
文章首先构造了自动答疑系统架构,改进了中文分词算法,并利用领域本体库和语句相似度设计了该系统。
This paper first constructs the architecture of an automatic question answering system, improves the Chinese word segmentation algorithm, and then designs the system using a domain ontology base and sentence similarity.
在分词技术、索引技术、结构化查询语言技术的基础上,提出了一个基于XML文档数据库的信息检索系统,这一系统模型主要由分词模块、索引模块及查询模块组成。
Building on word segmentation, indexing, and structured query language technologies, this paper proposes an information retrieval system based on an XML document database; the system model consists mainly of a segmentation module, an indexing module, and a query module.
实现该系统时,引入了搜索引擎的架构模型,即网络蜘蛛、索引器和检索器,并且加入了分词和搜索自动提示功能。
To implement the system, we introduce the architecture model of a search engine, namely a web spider, an indexer, and a searcher, and we add word segmentation and automatic search suggestion functions.
提出了一种面向网络答疑系统的无词典分词方法。
A dictionary-free word segmentation method for online question answering systems is proposed.
本文系统地介绍了中文分词技术和模式匹配理论,依据他们的理论算法设计出解决地址数据匹配整合流程体系的方法。
This paper systematically introduces Chinese word segmentation technology and pattern matching theory, and, based on their algorithms, designs a method for the process of matching and integrating address data.
它是综合的技术处理系统,其设计与开发需要分词、词法分析、检索、实体识别、答案抽取等几个方面的技术支撑。
It is an integrated technical processing system whose design and development require support from several technologies, including word segmentation, lexical analysis, retrieval, entity recognition, and answer extraction.
对于基于词的搜索引擎等中文处理系统,分词速度要求较高。
Chinese processing systems such as word-based search engines place high demands on word segmentation speed.
系统首先对待切分词使用有限状态自动机进行分析。
The system first analyzes the words to be segmented using a finite-state automaton.
在进行词性标注时,作者分析了前人的基于规则的词性标注的工作,并提出了基于规则优先级的词性标注方法,最后实现了分词和标注系统。
For part-of-speech tagging, the authors analyze earlier rule-based tagging work, propose a tagging method based on rule priority, and finally implement a word segmentation and tagging system.
文中论述了在开发中文信息检索系统中所涉及到的两项关键技术,即中文分词技术和检索技术。
This paper discusses two key techniques involved in developing a Chinese information retrieval system, namely Chinese word segmentation and retrieval.