没有分区,将当前数据放到可以处理的正确位置需要许多开销,这种开销是随着机器的增加呈指数(不是线性)增长的。
Without partitioning, much overhead is incurred from getting the current data to the right place for processing, and this overhead grows exponentially -- not linearly -- as more machines are added.
该应用程序的多个实例可以使用并行作业管理器(PJM)来进行分派,其中各个分区处理不同的数据部分。
Multiple instances of that application can be dispatched using the Parallel Job Manager (PJM), where each partition processes a different section of the data.
在上例中,关键数据是用户配置文件,可以跨一组机器分布(分区)以进行并行处理。
In the initial example above, the key data was user profiles, which could be distributed (partitioned) across a set of machines for parallel processing.
本文描述的高级并行机制、数据和分区关联以及其他技术都提供了一组重要工具,使用这些工具可以帮助简化脱机处理。
Advanced parallelism, data and partition affinity, and other techniques described in this article provide an essential set of tools that can help streamline offline processing.
每个数据库分区上的数据库管理器使用每个系统上的处理器来管理数据库中属于该分区的那部分数据。
The processors on each system are used by the database manager at each database partition to manage its own part of the total data in the database.
如果数据库分区和范围分区一同使用,那么表内的行首先会跨数据库分区做散列处理,然后再在每个数据库分区中执行范围分区。
If database partitioning and range partitioning are used together, the rows in a table are first hashed across the database partitions and then range-partitioned within each database partition.
对于这种多租户的架构,数据和配置被虚拟分区,以使每个客户组织都能处理一个虚拟的应用程序实例。
With a multi-tenant architecture, data and configuration are virtually partitioned to allow each client organization to work with a virtual application instance.
除了硬件成本之外,水平分区特性和需要企业版或者数据中心版本的许可,这两个版本对于每个处理器的零售价分别是27,495和54,990美元。
In addition to the hardware costs, the horizontal partitioning feature requires an Enterprise or Datacenter license, which retail for $27,495 and $54,990 per processor, respectively.
范围分区可以满足这个管理需求,DB2 9.7已经改进了对这种技术的支持,可以处理XML数据。
Range partitioning addresses this administrative requirement, and DB2 9.7 extends prior support for this technology to include XML data.
通过对上述两个表的连接键(pdb_id)进行散列处理来分布两个表也能确保给定pdbml文档中所有原子行均能作为本身的PDBML文档存储在相同的数据库分区中。
Distributing both tables by hashing on their join key (pdb_id) also ensures that all atom rows for a given PDBML document are stored in the same database partition as the PDBML document itself.
使用DPF,我们建议通过对pdb_id列的值进行散列处理来跨数据库分区分布pdbxml表和atom_site表中的数据。
Using DPF, we recommend distributing the data in the PDBXML and atom_site tables across the database partitions by hashing on the values of the pdb_id column.
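The co-location effect described in the two sentences above can be illustrated with a minimal Python sketch. It is not DB2's actual distribution algorithm: the hash function (CRC32), the partition count, and the sample rows are all assumptions chosen for illustration. The point is only that hashing both tables on the same key (pdb_id) sends matching rows to the same partition.

```python
import zlib

NUM_PARTITIONS = 4  # assumed partition count, for illustration only


def partition_for(pdb_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a distribution-key value to a partition number with a stable hash."""
    return zlib.crc32(pdb_id.encode("utf-8")) % num_partitions


def distribute(rows, key, num_partitions=NUM_PARTITIONS):
    """Place each row in the partition selected by hashing its key column."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[partition_for(row[key], num_partitions)].append(row)
    return partitions


# Hypothetical sample data standing in for the pdbxml and atom_site tables.
pdbxml_rows = [{"pdb_id": "1ABC"}, {"pdb_id": "2XYZ"}, {"pdb_id": "3DEF"}]
atom_rows = [
    {"pdb_id": "1ABC", "atom": "N"},
    {"pdb_id": "1ABC", "atom": "CA"},
    {"pdb_id": "2XYZ", "atom": "O"},
    {"pdb_id": "3DEF", "atom": "C"},
]

pdbxml_parts = distribute(pdbxml_rows, "pdb_id")
atom_parts = distribute(atom_rows, "pdb_id")

# Every atom row lands in the same partition as its parent document,
# so a join on pdb_id never needs to move rows between partitions.
colocated = all(
    {"pdb_id": atom["pdb_id"]} in pdbxml_parts[i]
    for i, part in enumerate(atom_parts)
    for atom in part
)
```

Because both tables use the same hash on the same key, the join can be evaluated locally on each partition.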
db2pdbc:它处理来自远程节点的并行请求。(只用于分区数据库环境中)。
db2pdbc: Handles parallel requests from remote nodes (used only in a partitioned database environment).
输入数据使用这样一种方法进行分区,即在并行处理的计算机集群中分区的方法。
Input data is partitioned in such a way that it can be distributed among a cluster of machines for processing in parallel.
而且,通过使用DPF,可以显著缩短备份和恢复时间,因为每台参与分区的机器需要处理的数据量更小了。
Furthermore, backup and recovery times can be significantly reduced when using DPF, because each machine participating in the partitioning has less data to handle.
该连接器能够以并行或序列模式处理分区数据库。
The connector can work with partitioned databases in parallel or sequential mode.
这既包括缓存数据使其接近应用程序,又包括复制靠近(分区)数据的应用程序逻辑,以实现并行处理。
This includes both caching the data close to the application and replicating application logic close to the (partitioned) data for parallel processing.
而且因为切分是在应用程序层面进行的,您可以对不支持常规分区的数据库进行切分处理。
And because sharding is done at the application layer, you can do it for databases that don't support regular partitioning.
由于数据被划分在多个数据库分区上,因而可以使用多台计算机上的多个处理器的处理能力来满足对信息的请求。
Since the data is divided across database partitions, you can use the power of multiple processors on multiple computers to satisfy requests for information.
当处理一个查询时,请求也相应地被划分成多个部分,以便让各个数据库分区各自处理其负责的那些行。
When a query is processed, the request is divided so each database partition processes the rows that it is responsible for.
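The way a request is divided among partitions can be sketched in Python as a scatter-gather pattern. This is an illustrative model only, not how any particular database engine is implemented: the thread pool, the function names, and the sample rows are assumptions. Each partition evaluates the predicate on its own rows, and a coordinator combines the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def local_count(partition_rows, predicate):
    """Work done by one partition: count only the rows it owns."""
    return sum(1 for row in partition_rows if predicate(row))


def parallel_count(partitions, predicate):
    """Coordinator: fan the request out to every partition, then add up the partial counts."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(lambda rows: local_count(rows, predicate), partitions))
    return sum(partials)


# Hypothetical rows, already spread across three partitions.
partitions = [
    [{"id": 1, "amount": 50}, {"id": 5, "amount": 500}],
    [{"id": 2, "amount": 700}],
    [{"id": 3, "amount": 20}, {"id": 7, "amount": 900}],
]

# Roughly equivalent to: SELECT COUNT(*) FROM t WHERE amount > 100
total = parallel_count(partitions, lambda row: row["amount"] > 100)
```

No partition reads another partition's rows; only the small partial counts travel back to the coordinator.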
DPF可以通过增加数据库分区来提高处理能力,因此,随着表的增长,仍然可以保持较高的查询性能。
DPF can maintain consistent query performance as the table grows by providing the capability to add more processing power in the form of additional database partitions.
此外,通过跨多个数据库分区分散数据以使多个分区可并行处理分配到其上的数据,也能减少复杂的分析查询的响应时间。
Also, the response time of complex analytical queries can be reduced by spreading the data across multiple database partitions, such that all partitions work on their assigned data in parallel.
标准JPA没有处理分区或分配的有效方式,因为规范已经隐式地假定单个数据库作为存储库。
Standard JPA has no effective means of dealing with sharding or partitioning, as the specification implicitly assumes a single database as the repository.
如图4所示,DPF是一种物理数据库设计选项,它在一个多处理环境中使用多个单独的数据库分区。
As Figure 4 shows, DPF is a physical database design option that uses multiple separate database partitions in a multi-processing environment.
请参考“分区数据库环境中的MQT优于昵称限制”,了解关于如何处理该限制的提示。
Please see "Restrictions on MQTs over nicknames in a partitioned database environment" for tips on how to work around this restriction.
应将该值设置为大于1而小于等于数据库分区可使用的处理器数目。
This value should be set to greater than 1 and no more than the number of processors that the database partition can use.
在该例中,分区号是0,因为我们处理的是单分区数据库。
The partition number is 0 in this case because we are dealing with a single-partition database.
当哈希操作达到其最大递归级数并转换到替换计划以处理剩余的分区数据时发生哈希释放。
Hash bailout occurs when a hashing operation reaches its maximum recursion level and shifts to an alternate plan to process the remaining partitioned data.
启用了内部并行(intra-parallel)处理的未分区(non-partitioned)数据库。
Non-partitioned databases for which intra-parallel processing is enabled.
总之,在DataStage上下文中,平行处理是一个作业的内部特征,该作业通过被利用的数据或处理分区的数量来测量。
In summary, parallel processing, in the context of DataStage, is an internal characteristic of a job that is measured by the number of data or processing partitions that are utilized.
尽管Table Service可以支持大容量的数据,但由于只支持少量的数据类型,所以实际上还是有比较严格的限制,这就要求开发人员从一开始就需要考虑如何处理分区问题。
Despite their large size, tables are actually pretty restrictive: they support only a handful of data types and require developers to think about partitioning from the very beginning.