That takes us to an important point that I wanted to secretly and slyly get across to everyone: Sometimes applying a data mining algorithm to your data will produce a bad model.
这也是我想审慎地告诉大家的一点:有时候,将数据挖掘算法应用到数据集有可能会生成一个糟糕的模型。
Classification (also known as classification trees or decision trees) is a data mining algorithm that creates a step-by-step guide for how to determine the output of a new data instance.
分类(也即分类树或决策树)是一种数据挖掘算法,为如何确定一个新的数据实例的输出创建逐步指导。
To address this situation, a data mining algorithm, EA (efficient algorithm), is proposed that quickly calculates the confidence of association rules; it solves the problem of how to efficiently mi…
针对这种情况,提出了一种高效关联规则的挖掘算法EA,解决了在挖掘关联规则过程中如何高效挖掘满足最小置信度的关联规则问题。
Decision tree algorithms are applied to data mining for mammography classification; a medical image classifier based on a decision tree algorithm is proposed, and experimental results are given.
利用决策树算法对乳腺癌图像数据进行分类,实现了一个基于决策树算法的医学图像分类器,获得了分类的实验结果。
This thesis mainly discusses how to classify potential customers with data mining algorithms and techniques, so that practical work can be properly targeted.
本文主要讨论如何在信用卡一级代理过程中运用数据挖掘算法和技术对潜在客户进行分类,以便能在开展业务的过程中有所针对性。
This paper proposes a rough spectral clustering algorithm and applies it to text data mining.
该文提出了一种粗糙谱聚类算法,并将其应用于文本数据挖掘。
It often requires someone to sift through mountains of data that a typical mining tool has missed because of the assumptions made in writing the mining algorithm.
这往往需要有人去筛查大量被典型挖掘工具遗漏的数据,而这些遗漏源于编写挖掘算法时所作的假设。
This article proposes a data cleaning method based on the EM algorithm, for the purpose of mining high-quality decisions by performing data reasoning in a database with incomplete, noisy and uncertain data.
针对存在不完整、含噪声和不确定数据的数据库,通过挖掘高质量的决策,对数据库的数据进行推理,提出了一种基于EM算法的数据清理方法。
The ID3 algorithm is a classical decision tree classification algorithm in data mining, but it has poor resistance to noise.
ID3算法是数据挖掘中经典的决策树分类算法,该算法具有抗噪声能力差的缺点。
It summarizes the main features of every algorithm by analyzing and comparing a variety of typical classifiers to provide a basis for selecting or improving the algorithms in data mining.
通过对当前数据挖掘中具有代表性的优秀分类算法进行分析和比较,总结出了各种算法的特性,为使用者选择算法或研究者改进算法提供了依据。
This thesis presents a data generalization algorithm based on the data cube. The algorithm can clean the data for data mining and improve the efficiency of data mining.
该文提出了一种基于数据立方体的数据泛化算法用于数据预处理,能够为数据挖掘提供良好的数据环境,提高数据挖掘的有效性。
This paper presents a quantitative analysis and comparative study based on data mining of SCI2000 and the Triple Helix algorithm.
引荐了这方面的一个基于SCI2000的数据挖掘后,在三重螺旋算法下的定量规范性研究,并进行了相应国别区域比较分析。
In data mining, the decision tree algorithm is a key research direction.
在数据挖掘中,决策树方法是一个重点研究方向。
Cluster analysis is a method of spatial data mining. Clustering algorithms can find useful cluster structures directly from spatial databases.
聚类分析是空间数据挖掘的一种方法,聚类算法能从空间数据库中直接发现一些有用的聚类结构。
The third part presents data experiments with the mining algorithm on one month of historical operating data from a 600 MW unit, and analyzes the mining results under different operating conditions.
第三部分针对某600MW机组一个月的历史运行数据进行模式挖掘算法的数据实验,并分析了不同工况下的挖掘结果。
Data preprocessing is an indispensable part of data mining: it cleans and summarizes the data and provides target data for the classification algorithm.
数据预处理是数据挖掘中不可或缺的一部分,是对数据进行初步地清理和归纳,为分类算法提供目标数据。
The paper mainly discusses a clustering algorithm based on density and grid in data mining, which has high clustering efficiency and low time complexity.
该文主要讨论数据挖掘中一种基于密度和网格的聚类分析算法及其在客户关系管理中的应用。
The algorithm is applied to similarity mining of electrical load time series data for a steel plant. The simulation results show the effectiveness of the algorithm.
将该算法用于某钢铁企业的电力负荷时序数据,计算结果表明了算法的有效性。
The way frequent candidates are generated and the pruning techniques are difficult technical problems when existing traditional association rule mining algorithms are applied to spatial data mining.
现有的传统关联规则挖掘算法构建频繁候选项的方式和修剪技术是其应用于空间数据挖掘的技术难题。
A frequent-item mining algorithm for stream data (SW-COUNT) was proposed, which uses data sampling to mine frequent items of data streams under sliding windows.
提出了一种流数据上的频繁项挖掘算法(SW-COUNT)。该算法通过数据采样技术挖掘滑动窗口下的数据流频繁项。
Mining outliers from a dataset is slow. Exploiting the characteristics of grids, a fast outlier mining algorithm is proposed that first partitions the data into a set of cells.
针对数据集中离群数据的挖掘速度的问题,提出了快速的基于单元格的离群数据挖掘算法。
This paper describes the relevant concepts, presents a CBR model based on dynamic data stream mining, and gives an improved clustering algorithm for data streams.
首先阐述了相关概念,接着提出了一种基于动态数据流挖掘的案例推理模型,其中动态数据流挖掘算法采用改进的数据流聚类算法。
CURE is a typical clustering algorithm that is designed for the mining of mass data.
CURE算法是针对大规模数据聚类算法的典型代表。
Methods: Decision tree algorithms are applied to data mining for mammography classification, and a medical image classifier based on a decision tree algorithm is proposed.
方法利用决策树算法对乳腺癌图像数据进行分类,提出了一个基于决策树算法的医学图像分类器。
Among these three categories, a possible common-subtree algorithm is presented, based on the enumeration tree technique from the data mining domain.
本文深入探讨了每类算法中的代表算法,其中根据数据挖掘中枚举树相关技术提出了一种可能的公共子树查找算法的思想。
A new data-mining algorithm based on dynamic programming and a dynamic time warping function was proposed and applied to technical analysis of the stock market.
提出了一种基于动态规划和动态时间弯折函数的数据挖掘算法,并应用该算法对股市进行技术分析。