Finally, an algorithm for computing document similarity is presented, which filters abnormal information more efficiently.
在信息匹配算法方面,通过计算文档向量之间的相似度,实现网络信息的有效过滤。
This paper proposes a new method for computing XML document similarity based on the comprehensive semantics of XML documents.
该文提出了一个新的基于综合语义的可扩展标记语言文档相似度计算方法。
To address the limitations of VSM-based document similarity measurement, this paper proposes a document similarity algorithm based on computing common substrings.
针对向量空间模型在文档相似度量方面的局限,提出了基于计算公共子串的文档相似度量算法。
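The sentence above does not say how the common substrings are turned into a score, so the following is only a minimal sketch of one possible reading: it takes the length of the longest common substring of two texts and divides it by the length of the longer text. The function names and the normalization are assumptions for illustration, not the algorithm from the cited paper.

```python
def longest_common_substring_length(a, b):
    """Length of the longest contiguous substring shared by a and b."""
    prev = [0] * (len(b) + 1)  # DP row for the previous character of a
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common substring ending at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def substring_similarity(a, b):
    """Hypothetical score in [0, 1]: LCS length over the longer text's length."""
    if not a or not b:
        return 0.0
    return longest_common_substring_length(a, b) / max(len(a), len(b))


score = substring_similarity("document similarity", "sentence similarity")
```

Unlike a bag-of-words vector, this measure is sensitive to word order, which is one limitation of VSM that a substring-based approach can address.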
The TCUSS algorithm measures document similarity by the semantic similarity between the words in two concept lists, then clusters the documents using graph-based analysis, thereby avoiding the restrictions on cluster shape that some clustering algorithms impose.
TCUSS算法利用两个概念列表中单词间的语义相似度作为文档间相近程度的度量,并以图为基础进行聚类分析,避免有些聚类算法对聚簇形状的限制。
In network information retrieval, classification, clustering, ranking, and relevance feedback based on the document vector space all require computing similarity.
在网络信息检索中,基于文档向量空间的分类、聚类、排序与相关性反馈需要计算相似度。
The application of this model to document retrieval is discussed in two aspects: navigational browsing and conditional queries based on semantic similarity calculation.
对模型在文献检索中的应用进行了初步探讨,包括两方面:导航浏览和基于语义相似度计算的条件查询。
Terms that occur in the document but not the query, or vice versa, have no effect on the similarity.
在文件中而不是查询中出现的词语,或反之亦然,在相似性没有影响。
If you review the textbook definition of cosine similarity, you'll find that it's the sum of products of corresponding term weights in a query and a document, normalized.
如果你回顾余弦相似度的教科书的定义,你会发现它是在一个查询和文档的相应术语权重的产品的总和,归一化。
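The textbook definition above can be sketched as follows, assuming the query and the document are each represented as a dictionary mapping terms to weights (the representation and the function name are assumptions for illustration). Terms that occur on only one side contribute nothing to the sum of products; in this sketch they still enter the normalization through the vector lengths.

```python
import math


def cosine_similarity(query, document):
    """Cosine similarity between two sparse term-weight dictionaries."""
    # Only terms present on both sides contribute to the sum of products.
    shared_terms = set(query) & set(document)
    dot = sum(query[t] * document[t] for t in shared_terms)

    # Normalize by the Euclidean length of each full vector.
    norm_q = math.sqrt(sum(w * w for w in query.values()))
    norm_d = math.sqrt(sum(w * w for w in document.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_q * norm_d)


query = {"document": 1.0, "similarity": 1.0}
doc = {"document": 2.0, "similarity": 1.0, "xml": 3.0}
score = cosine_similarity(query, doc)
```

Here the term `xml` adds nothing to the numerator because it is absent from the query, but it does lengthen the document vector and so lowers the normalized score.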
A method for character string similarity calculation speeds up document retrieval by selecting the partial character strings used in the similarity calculation.
一种字符串相似度计算方法,通过选择相似度计算中使用的部分字符串,来进行文件检索的高速化。
Lastly, the document management table is rearranged in decreasing order of similarity, and documents with high similarity are selected from the database as the retrieval result.
最后,以相似度高的顺序重新排列文件管理表,从数据库中选择相似度高的文件,作为检索结果。
The document number and the similarity are recorded as a pair in a document management table.
将文件号码和相似度形成组并记录在文件管理表中。
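The recording and reordering steps described above can be sketched as follows, assuming the document management table is a simple list of (document number, similarity) pairs. The name `rank_documents`, the `top_k` cutoff, and the sample values are hypothetical and only illustrate the idea.

```python
def rank_documents(table, top_k=3):
    """Sort (document number, similarity) pairs by decreasing similarity
    and return the top_k entries as the retrieval result."""
    ranked = sorted(table, key=lambda row: row[1], reverse=True)
    return ranked[:top_k]


# Hypothetical table: each row pairs a document number with its similarity.
table = [(101, 0.42), (102, 0.91), (103, 0.17), (104, 0.78)]
results = rank_documents(table, top_k=2)
```

A fixed cutoff is only one way to pick "high similarity" documents; a similarity threshold would work equally well in this sketch.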
For keyword search in XML documents, meaningless query results are examined from two aspects: equivalence of element label content and similarity of element structure.
针对XML文档关键字搜索问题,从元素标签内容等价和元素结构相似性等价两个方面考虑无效的查询结果。
Fuzzy C-means is then used to cluster the documents based on the document similarity results computed above.
然后采用模糊c均值根据上述计算文档相似度的结果对文档进行聚类。
Sentence similarity computation is very important in every field of natural language processing; in multi-document summarization, it is a key problem.
句子间相似度的计算在自然语言处理的各个领域都占有很重要的地位,在多文档自动文摘技术中,句子间相似度的计算是一个关键的问题。