分析和实验表明,该算法适合于海量数据查询并能有效地解决机群并行环境下数据偏斜所造成的查询性能低下的问题。
Analysis and experiments show that the algorithm is well suited to querying massive data sets and effectively resolves the poor query performance caused by data skew in a parallel cluster environment.
该方法首先通过在加权最小二乘支持向量机的基础上加入对数据偏斜的处理,解决了元信息分类时关键词特征稀疏和样本高度不均衡问题;
The method first adds data-skew handling to the weighted least squares support vector machine (LS-SVM), which addresses the sparse keyword features and highly imbalanced samples encountered in meta-information classification.