This paper focuses on extracting translation pairs from unaligned Chinese-English bilingual corpora.
Working from bilingual corpora, the algorithm uses machine learning to perform semantic clustering automatically and generate a word-to-word similarity matrix.
Because bilingual corpora are an important resource for developing statistical machine translation systems, this paper presents a method for automatically mining bilingual parallel web pages from the Web.
Extracting bilingual dictionaries from parallel corpora is an important research direction.
As an important basic resource, bilingual parallel corpora play a crucial role in artificial intelligence research.
This paper proposes an algorithm for automatically extracting a bilingual term lexicon from English-Chinese parallel corpora.
However, large-scale bilingual parallel corpora are not easy to obtain, and existing parallel corpora cannot yet meet the practical needs of processing real-world text in terms of scale, timeliness, and domain balance.