Database Reference
Replicated_Size
In Figure 10.4, you can see that we have six compute nodes and one
replicated filegroup per compute node. The replicated size option sets
how large the replicated filegroup will be on each compute node.
This value is measured in gigabytes (GB) and can be expressed as a decimal.
Therefore, a replicated size of 10 will allocate 10GB on each compute node
to the replicated filegroup.
Whatever value we specify for the replicated size is used for all six
replicated filegroups. There is no option to have different sizes. Data
held in a replicated table is, in effect, copied to every compute node,
which is why the amount actually allocated is the replicated size multiplied
by the number of nodes you have. In practice, when specifying replicated size,
you simply need to decide how much capacity you need to hold all the data
once. PDW takes care of the allocation and the replication of the table.
The fact that we will be holding it six times is really an internal
optimization.
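The multiplication described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not anything PDW exposes: the function name and its parameters are hypothetical, and the six-node count is taken from the example appliance in this section.

```python
def replicated_allocation_gb(replicated_size_gb, compute_nodes):
    """Return the total space (GB) allocated to replicated filegroups.

    replicated_size_gb: the value given for the replicated size, i.e. the
        capacity needed to hold all replicated data once.
    compute_nodes: number of compute nodes in the appliance (hypothetical
        parameter, used here only for illustration).
    """
    # Every compute node gets its own replicated filegroup of the same size.
    return replicated_size_gb * compute_nodes


# A replicated size of 10 on the six-node appliance from the text.
total_gb = replicated_allocation_gb(10, 6)
print(total_gb)  # 60
```

The value you supply is still just 10; the 60GB figure is what ends up allocated across the appliance as a whole.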
The replicated filegroup is only used by PDW when we decide to create a
table and specify that we wish to replicate it. Because these tables tend to
be our dimensions, we typically expect to see a relatively small size allocated
here. If someone had allocated terabytes rather than gigabytes, that would
warrant serious investigation.
Distributed_Size
The distributed size value is used differently from the replicated size. The
distributed size value provided in the CREATE DATABASE statement is
split evenly across all the distributions, so that the same space is
available in each bucket for distributed data. Data is then spread
across the distributions, with each distribution holding a distinct subset of
the distributed table data. As with the replicated size, the value supplied
in the DDL statement is measured in gigabytes and can be expressed as a decimal.
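As a rough sketch of the even split, the following Python computes the share each distribution receives. The helper name and parameters are hypothetical. It assumes 8 distributions per compute node, as on the six-node appliance used in this chapter, and 1TB = 1024GB.

```python
# Assumption for this sketch: 8 distributions (A-H) per compute node.
DISTRIBUTIONS_PER_NODE = 8


def per_distribution_gb(distributed_size_gb, compute_nodes,
                        per_node=DISTRIBUTIONS_PER_NODE):
    """Return the space (GB) each distribution receives from the distributed size."""
    # Total number of distributions across the appliance.
    distributions = compute_nodes * per_node
    # The distributed size is divided evenly between them.
    return distributed_size_gb / distributions


# 10TB expressed in GB, on a six-node appliance (48 distributions).
share_gb = per_distribution_gb(10 * 1024, 6)
print(round(share_gb, 2))  # 213.33
```

Unlike the replicated size, the figure supplied here is the total for the appliance, so the per-distribution share shrinks as distributions are added.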
If you remember back to the toast analogy, we want to spread the data as
evenly as possible over the appliance. Therefore, if we have 6 compute
nodes in our PDW appliance, we also have a total of 48 distributions (8
distributions per node, A-H). If I allocate 10TB (10,240GB) as my distributed
size, I am therefore allocating 1/48 of this to each distribution. In this example, each
distribution would be allocated approximately 213.33GB of the available