大规模数据集是数据挖掘高效实现的障碍。
Large data sets are an obstacle to efficient data mining.
提出了一种大规模数据集的训练样本选择方法。
A method is proposed for selecting training samples from large data sets.
BIRCH算法是针对大规模数据集的聚类算法。
The BIRCH algorithm is a clustering algorithm for large data sets.
因而,大规模数据集的交互渲染不能通过强力模型进行。
As a result, massive datasets cannot be interactively rendered by brute force methods.
算法的复杂度较低,适用于大规模数据集快速离群点检测。
The algorithm has low complexity and is suitable for fast outlier detection on large data sets.
并发集合让大规模数据集的管理更加简单,并可以大量减少使用同步的需要。
Concurrent collections make it easier to manage large collections of data, and can greatly reduce the need for synchronization.
基于空间点集的连通性构造的等价关系,提出一种针对大规模数据集的快速分组算法。
Based on an equivalence relation constructed from the connectivity of spatial point sets, a fast grouping algorithm for large data sets is proposed.
该异常检测方法关于数据集大小和属性个数具有近似线性时间复杂度,适合于大规模数据集。
The time complexity of this anomaly detection method is nearly linear in the size of the data set and the number of attributes, which makes it suitable for large data sets.
大规模数据集的分类是数据挖掘中的一个重要课题,而分类预测技术在税收领域的应用有着很好的前景。
Classification of large data sets is an important topic in data mining, and classification and prediction techniques have promising applications in taxation.
实验结果表明:本文算法在计算时间和空间上具有一定的比较优势,对大规模数据集具有较强的可扩展性。
The experimental results show that the proposed algorithm has a certain comparative advantage in computation time and space, and that it scales well to large data sets.
对于横向分片数据,Mnesia在伸缩性和低延迟事务上表现突出,接下来的一个挑战可能是对于超大规模数据集它如何伸展。
While Mnesia excels at scalability and low-latency transactions on horizontally fragmented data, one remaining challenge may be how it scales to very large data sets.
该算法显著减少了已有算法中产生频繁项集及扫描大规模数据库的操作,性能改善明显。
The algorithm significantly reduces the frequent itemset generation and large-database scans required by existing algorithms, which noticeably improves performance.