频繁项集是挖掘流数据挖掘的基本任务。
Mining frequent itemsets is a basic task in data stream mining.
挖掘事务库中的频繁项集是数据挖掘的重要任务之一。
Mining frequent itemsets in transactional databases is one of the important tasks of data mining.
实验表明,该算法对于频繁项集挖掘具有比较高的效率。
Experiments show that FP-DFS is fairly efficient at frequent itemset mining.
第二步是求基于多维的频繁项集的算法的实现及关联规则生成。
The second step implements the algorithm for finding multidimensional frequent itemsets and generates the association rules.
该算法通过对模式树的各种操作简化了对频繁项集的搜索过程。
To further improve the scalability of the algorithm, we study the pattern tree in more depth and, based on that study, propose a new algorithm called FP-DFS.
频繁项集挖掘是一个非常基本的,但最重要的任务,在数据流处理。
Frequent itemset mining is a very basic but important task in data stream processing.
在此基础上,给出了在伪装后的数据集上生成频繁项集的挖掘算法。
On this basis, a mining algorithm is given that generates frequent itemsets from the disguised data sets.
引入扩展频繁项集的概念,大大减小了检查频繁项集是否闭的搜索空间。
Furthermore, the concept of an extended frequent itemset is introduced, which greatly reduces the search space for checking whether a frequent itemset is closed.
发现频繁项集是关联规则挖掘的主要途径,也是关联规则挖掘算法研究的重点。
Discovering frequent itemsets is the main approach to association rule mining, and it is also the focus of research on association rule mining algorithms.
应用该多层概要数据结构,实现了面向数据流的多层频繁项集的动态近似查找算法。
Using this hierarchical sketch data structure, a dynamic, approximate algorithm for finding hierarchical frequent itemsets over data streams was implemented.
现有关联规则挖掘算法都是在频繁项集基础上进行挖掘,关于非频繁项集的资料很少。
Existing association rule mining algorithms all mine on the basis of frequent itemsets, and very little material is available on infrequent itemsets.
该算法显著减少了已有算法中产生频繁项集及扫描大规模数据库的操作,性能改善明显。
The algorithm markedly reduces the frequent-itemset generation and large-database scanning operations found in existing algorithms, and its performance improves noticeably.
由于随机哈希函数不可逆,目前的概要数据结构不得不遍历关键字地址空间以查找和估计频繁项集。
Because random hash functions are not invertible, current sketch data structures have to traverse the key address space to find and estimate frequent itemsets.
其利用关联规则得到的频繁项集实时地匹配用户的当前访问序列,对不同的用户提供不同的推荐资源。
It matches the user's current access sequence in real time against the frequent itemsets obtained from association rules, and offers different recommended resources to different users.
传统的挖掘频繁项集的并行算法存在数据偏移、通信量大、同步次数较多和扫描数据库次数较多等问题。
Traditional parallel algorithms for mining frequent itemsets suffer from problems such as data skew, heavy communication, frequent synchronization, and many database scans.
该方法克服了传统关联规则挖掘方法的不足,在产生频繁项集的同时进行规则挖掘,从而提高了挖掘效率。
This method overcomes the shortcomings of traditional association rule mining methods by mining rules while the frequent itemsets are being generated, thereby improving mining efficiency.
因此提出了最大支持度的概念,用来约束频繁项集的挖掘,排除没有意义的关联规则同时也提高了挖掘的效率。
Therefore, the concept of maximum support is proposed to constrain frequent itemset mining; it excludes meaningless association rules and also improves mining efficiency.
此外,由于树结构在挖掘频繁项目时不需要产生频繁项集及对这些频繁项进行测试而被广泛应用于数据挖掘中。
In addition, tree structures are widely used in data mining because, when mining frequent items, they do not require generating frequent itemsets and testing them.
采用项集格生成树的数据结构,将最大频繁项集挖掘过程转化为对项集格生成树进行深度优先搜索获取所有最大频繁节点的过程。
An itemset lattice tree data structure is adopted, turning maximal frequent itemset mining into a depth-first search of the itemset lattice tree that obtains all maximal frequent nodes.
算法在产生了频繁1-项集之后,分别利用1-项集中的项作为约束条件,建立压缩FP-树,挖掘跨事务关联规则。
After the frequent 1-itemsets are produced, the algorithm uses each item in them as a constraint to build a compressed FP-tree and mine inter-transaction association rules.
针对频繁闭项集挖掘算法中数据结构与处理机制复杂的问题,提出窗口快速滑动的数据流频繁闭项集挖掘算法——MFWSR。
To address the complexity of the data structures and processing mechanisms in frequent closed itemset mining algorithms, this paper proposes MFWSR, an algorithm for Mining Frequent closed itemsets over data streams with Window Sliding Rapidly.