要减少执行时间,除非规则是柱状的,否则,在可能的情况下采用虚拟表,并避免使用抽样。
To reduce execution time, unless the rules are columnar, you should use virtual tables where possible, and avoid using sampling.
图4展示了具有绑定的规则,它涉及来自多个表的列,这导致了在连接操作之后进行抽样。
Figure 4 shows a rule with bindings that involve columns coming from more than one table, which causes the sampling to be done after the join operation.
这样,就可以利用抽样来很有效地减少规则所评估的记录的数量,并减少输出量。
Thus, you can use sampling to effectively reduce both the number of records evaluated by the rule itself and the size of the output.
这确保了连接规则是正确的,但是抽样不会减少对连接资源必须的计算影响。
This ensures that the result of the join is correct, but the sampling doesn't reduce the computing effort necessary to join the sources.
与连接类似,如果在规则中定义了数据抽样,那么会在InfoSphereDataStage任务中执行。
As with joins, if data sampling is defined within a rule, it is performed within the InfoSphere DataStage job.
抽样有一定的规则,是吗?
There are certain rules for sampling, aren't there?
提出一种面向分类规则提取的分层抽样算法。
A stratified sampling algorithm for extracting classification rules is proposed in the dissertation.
规则判断条件采用的是抽样平均可信度。同时,探讨了时间对关联规则挖掘产生的影响以及在导读系统中的应用。
The sampled average confidence is used as the criterion for judging rules. This paper also discusses the influence of time on association rule mining and its application in a reading guidance system.
通过对比分析,说明不同的任务优先规则对随机抽样算法具有不同的影响,其中采用MINSLK等优先规则的随机抽样算法能够有效地缩短项目平均工期。
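Different priority rules are adopted in the random sampling procedure, and a comparative analysis shows that they affect the performance of the random sampling algorithm differently; with priority rules such as MINSLK, the algorithm can effectively shorten the average project duration.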