本文介绍了一个语音合成语料库。
This paper introduces a speech synthesis corpus.
包含本书中分析和处理的语言语料库。
This contains the linguistic corpora that are analyzed and processed in the book.
通过大型语料库(海量文本)来检查是个好方法。
Checking against a large corpus (a massive body of text) is a good approach.
操作大型语料库,设计语言模型,测试经验假设。
Manipulating large corpora, exploring linguistic models, and testing empirical claims.
语料库研究是建立在经验主义哲学的基础之上的。
Corpus research is founded on the philosophy of empiricism.
一些语言语料库,例如布朗语料库,已经进行了词性标注。
Some linguistic corpora, such as the Brown Corpus, have been POS tagged.
生成语料库的原因之一是规范化文本并删除任何不相关的内容。
One of the reasons for generating a corpus is to normalize text and remove anything that isn't relevant.
由于不关心单词的大小写,所以您从内容创建的语料库全是小写的。
Because you do not care about the case of a word, the corpus you create from the content is all lowercase.
大规模双语语料库的建设是进行基于语料库研究的基础。
Building large-scale bilingual corpora is the foundation of corpus-based research.
词语相似度的计算方法一般是利用大规模的语料库来统计。
Word similarity is generally computed from statistics gathered over a large-scale corpus.
平行语料库研究是近年来语料库语言学横向发展的新趋势。
Research on parallel corpora is a recent trend in the horizontal development of corpus linguistics.
若有牛津英语语料库,我或许就能从大量的语境中分辨出事实是否如此。
The Oxford English Corpus would have let me look at numerous contexts of the word to tease out whether this was the case.
基于语料库的研究方法为语言研究和外语教学提供了新的视角。
Corpus-based research methods offer a new perspective on language study and foreign language teaching.
在这篇文章中,我们将介绍一种基于语料库的汉语句法分析系统。
In this paper, we will introduce a corpus-based Chinese parsing system.
大规模语料库中分词不一致现象普遍存在,并影响语料库的建设质量。
Word segmentation inconsistency is widespread in large-scale corpora and affects the quality of corpus construction.
要想判断一个词的流行程度,有一种方法就是查询拥有二十亿单词的牛津英语语料库。
One way to gauge the prevalence of a word is to consult the Oxford English Corpus, a body of 2 billion words.
语料库语言学以语料库为手段研究语言,是一门独具特色的语言研究学科。
As an important branch of linguistic investigation, corpus linguistics features language study through the processing of corpus data.
信息抽取是从自由文本语料库构建数据库,实现情报自动收集的有效途径之一。
Information extraction is an effective way to build databases from free-text corpora and to automate intelligence gathering.
论文以翻译模板对训练语料库机器译文评测分数的贡献为依据,对其进行评价。
In this paper, translation templates are evaluated according to their contribution to the machine translation evaluation scores on the training corpus.
提出了一种从宾州中文语料库中自动提取词汇化树邻接文法(LTAG)的算法。
An algorithm is presented for automatically extracting a Lexicalized Tree Adjoining Grammar (LTAG) from the Penn Chinese corpus.
本文依据现有的研究成果,讨论语料库的语言研究及其在外语教学中的应用价值。
Drawing on existing research findings, this paper discusses corpus-based language study and its value in foreign language teaching.
一些文本语料库进行了分类,例如通过类型或者主题;有时候语料库的类别相互重叠。
Some text corpora are categorized, e.g., by genre or topic; sometimes the categories of a corpus overlap each other.
因此,对语料库加工时,必须对其进行一致性的检查和校正,保证语料库加工的质量。
Therefore, when processing a corpus, we must check and correct its consistency to guarantee the quality of the processing.
监督式分类器使用带标签的训练语料库来构建模型,根据输入的特定特征预测该输入的标签。
Supervised classifiers use labeled training corpora to build models that predict the label of an input based on specific features of that input.
20世纪下半叶,翻译研究与语料库语言学的融合为基于语料库的翻译研究奠定了基础。
In the second half of the 20th century, the integration of translation studies and corpus linguistics laid the foundation for corpus-based translation studies.
语料库的出现不仅标志着语言研究手段的技术进步,而且还标志着语言研究思想的重大转变。
The emergence of corpora marks not only a technological advance in the methods of language research but also a major shift in how language research is conceived.
建模语料库中的语言数据可以帮助我们理解语言模式,并且可以用于进行关于新语言数据的预测。
Modeling the linguistic data found in corpora can help us to understand linguistic patterns, and can be used to make predictions about new language data.