Computing the similarity of bilingual terms has important applications in fields such as cross-language information retrieval.
If you review the textbook definition of cosine similarity, you'll find that it is the sum of the products of corresponding term weights in a query and a document, normalized by the product of the two vectors' lengths.
It follows that, to compute cosine similarity, you only need to consider those documents that have at least one term in common with the query.
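The two sentences above can be illustrated with a minimal sketch. It assumes documents and queries are given as dictionaries mapping terms to weights; the names `build_index` and `cosine_scores` and the sample data are hypothetical and chosen only for illustration. An inverted index ensures that only documents sharing a term with the query are ever scored.

```python
import math
from collections import defaultdict


def build_index(docs):
    """Build an inverted index (term -> {doc_id: weight}) and each document's vector length."""
    index = defaultdict(dict)
    norms = {}
    for doc_id, weights in docs.items():
        for term, w in weights.items():
            index[term][doc_id] = w
        norms[doc_id] = math.sqrt(sum(w * w for w in weights.values()))
    return index, norms


def cosine_scores(query, index, norms):
    """Score only the documents that share at least one term with the query."""
    q_norm = math.sqrt(sum(w * w for w in query.values()))
    dots = defaultdict(float)
    for term, q_w in query.items():
        # Documents that lack this term never appear in the postings,
        # so they contribute nothing and are never visited.
        for doc_id, d_w in index.get(term, {}).items():
            dots[doc_id] += q_w * d_w
    # Normalize each dot product by the product of the two vector lengths.
    return {
        doc_id: dot / (q_norm * norms[doc_id])
        for doc_id, dot in dots.items()
        if q_norm and norms[doc_id]
    }


# Hypothetical term weights, for illustration only.
docs = {
    "d1": {"patent": 1.0, "term": 1.0},
    "d2": {"cosine": 2.0},
    "d3": {"term": 3.0, "query": 4.0},
}
query = {"term": 1.0}

index, norms = build_index(docs)
scores = cosine_scores(query, index, norms)
```

Here `d2` shares no term with the query, so it is absent from `scores` rather than being scored as zero.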
Because patent documents contain relatively many technical terms, follow a standardized format, and use rigorous language, this paper proposes a method for calculating sentence similarity based on pseudo-LCS.