The distinct advantages of this architectural approach include:
All the data is stored locally across the nodes, and each node manages the portion of the data and
the queries assigned to it by the master node.
Data is striped and mirrored across two nodes at a minimum, which increases scalability when
large query workloads are submitted.
Mirroring the data helps balance the workload and supports failover in case of an unplanned
outage.
When scalability is needed, the nodes can divide the work into discrete chunks, and more nodes
can simply be added, then configured and used by the system with minimal intervention.
A node can be assigned a specific role or set of roles to be available for querying, loading, and
managing data.
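The striping, mirroring, and master-node behavior in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular appliance works: the four-node cluster, the hash-based placement, the "mirror on the next node" rule, and every function name here are hypothetical choices made for the example.

```python
import zlib
from collections import defaultdict

NUM_NODES = 4  # hypothetical cluster size


def primary_node(key: str, num_nodes: int = NUM_NODES) -> int:
    """Pick the node that owns a row by hashing its distribution key."""
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def mirror_node(primary: int, num_nodes: int = NUM_NODES) -> int:
    """Place the mirror copy on the next node so it never shares the primary."""
    return (primary + 1) % num_nodes


def distribute(rows, num_nodes: int = NUM_NODES):
    """Stripe rows across nodes and record one mirror copy for each row."""
    primaries = defaultdict(list)
    mirrors = defaultdict(list)
    for key, amount in rows:
        p = primary_node(key, num_nodes)
        primaries[p].append((key, amount))
        mirrors[mirror_node(p, num_nodes)].append((key, amount))
    return primaries, mirrors


def scatter_gather_sum(primaries, num_nodes: int = NUM_NODES) -> int:
    """Master node: collect a partial sum from every node, then combine them."""
    partials = [
        sum(amount for _, amount in primaries.get(node, []))
        for node in range(num_nodes)
    ]
    return sum(partials)


# Hypothetical data: 100 customers with amounts 10, 20, ..., 1000.
rows = [(f"cust-{i}", i * 10) for i in range(1, 101)]
primaries, mirrors = distribute(rows)
total = scatter_gather_sum(primaries)
```

Each node only ever sums its own slice, which is the sense in which the work is divided into discrete chunks. Adding a node in this sketch means raising `num_nodes` and redistributing the rows.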
In a nutshell, the appliance architecture is a specialized configuration of multiple SMP nodes in
one physical device, with a custom operating system layer added to a Linux or Unix platform. The
device is managed by a smart controller and has its own internal network switch to move large
volumes of data across the nodes, bypassing the outside network completely. Because the appliance is
largely self-managing, administrators and database administrators (DBAs) rarely need to intervene to
maintain sustained performance and scalability. Appliances also provide the flexibility to deploy
commodity hardware platforms, which lowers the cost of operation and can shorten time to market. The
lower price point enables appliance users to add more nodes as needed without breaking the bank.
The other aspect of the appliance worth exploring and understanding before you select an appliance
or migrate to one is the data architecture. The appliance can support third normal form (3NF), star
schema, or hybrid data architectures, depending on the user's needs. The data distribution and data
storage techniques are what allow the appliance to scale with workloads and users, and we discuss
them next.
Data distribution in the appliance
Figure 9.2 shows a typical data distribution across the data warehouse appliance. In this figure,
data is distributed across multiple nodes. In addition, nodes 1, 3, and 5 typically mirror data
slices across one another, as do nodes 2, 4, and 6, while nodes 7 and 8 are on standby for use if
any of the other nodes suffers an outage.
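A rough sketch of how mirrored slices and standby nodes might behave during an outage is shown here. It assumes a hypothetical layout in which each slice has one primary and one mirror node and nodes 7 and 8 are idle spares. The `Cluster` class, the slice names, and the exact pairings are illustrative and are not taken from Figure 9.2.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    # slice name -> (primary node, mirror node); the pairings are hypothetical.
    slices: dict
    standby: list = field(default_factory=lambda: [7, 8])
    failed: set = field(default_factory=set)

    def serving_node(self, slice_name: str) -> int:
        """Return the node that should answer reads for a slice."""
        primary, mirror = self.slices[slice_name]
        if primary not in self.failed:
            return primary
        if mirror not in self.failed:
            return mirror
        raise RuntimeError(f"no live copy of slice {slice_name}")

    def fail(self, node: int) -> None:
        """Mark a node as down; reads fall back to the mirror copies."""
        self.failed.add(node)

    def promote_standby(self, failed_node: int):
        """Swap a standby node into the layout in place of a failed node.

        Copying the data onto the replacement is omitted from this sketch.
        Returns the promoted node, or None if no standby node is left.
        """
        if not self.standby:
            return None
        replacement = self.standby.pop(0)
        self.slices = {
            name: tuple(replacement if n == failed_node else n for n in pair)
            for name, pair in self.slices.items()
        }
        return replacement


cluster = Cluster(slices={
    "A": (1, 3), "B": (3, 5), "C": (5, 1),
    "D": (2, 4), "E": (4, 6), "F": (6, 2),
})
before = cluster.serving_node("A")        # served by its primary
cluster.fail(1)
during = cluster.serving_node("A")        # served by its mirror
replacement = cluster.promote_standby(1)
after = cluster.serving_node("A")         # served by the promoted standby
```

The point of the sketch is the ordering: queries keep running from the mirror the moment a node fails, and the standby node restores the second copy afterward.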
This type of data layout requires the designer or architect to:
Understand the data and the special requirements for handling data.
Understand the underlying relationships.
Understand the data skew.
Understand the data volume.
Understand the data growth.
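Of the items above, data skew is the easiest to check with a quick calculation. The sketch below assumes rows are placed by hashing a candidate distribution column, and it compares the busiest node with the average node. The column names, the sample data, and the max-over-mean measure are assumptions made for illustration.

```python
import zlib
from collections import Counter


def rows_per_node(keys, num_nodes):
    """Count how many rows land on each node for a candidate distribution key."""
    counts = Counter(
        zlib.crc32(str(key).encode("utf-8")) % num_nodes for key in keys
    )
    return [counts.get(node, 0) for node in range(num_nodes)]


def skew_ratio(counts):
    """Busiest node's row count divided by the average; 1.0 is perfectly even."""
    mean = sum(counts) / len(counts)
    return max(counts) / mean if mean else 0.0


# Hypothetical table of 1,000 orders: unique order IDs, but 90% from one country.
order_ids = list(range(1, 1001))
countries = ["US"] * 900 + ["CA"] * 50 + ["MX"] * 50

by_order_id = rows_per_node(order_ids, num_nodes=4)
by_country = rows_per_node(countries, num_nodes=4)

print("order_id:", by_order_id, "skew", round(skew_ratio(by_order_id), 2))
print("country: ", by_country, "skew", round(skew_ratio(by_country), 2))
```

Distributing on `country` forces all 900 "US" rows onto a single node, so that node does most of the work while the others sit idle. A high-cardinality column such as `order_id` would be expected to spread far more evenly.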
Once you have the data architecture mapped, the distribution of data, including the striping and
mirroring, provides the performance boost. That boost comes from three things: data availability in
more than one storage location, workloads that execute on noncompeting infrastructure, and a minimal
amount of data movement within the infrastructure.
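One common way a distribution choice reduces data movement is by placing rows that will be joined on the same node. The sketch below is an illustration under assumptions, not a description of any specific appliance: it supposes a customers table distributed on `customer_id` and counts how many order rows would sit on a different node than their matching customer under two candidate distribution columns. All names and data are hypothetical.

```python
import zlib


def node_for(value, num_nodes=4):
    """Map a distribution-key value to a node with a run-stable hash."""
    return zlib.crc32(str(value).encode("utf-8")) % num_nodes


def rows_to_move(orders, distribution_column, join_column="customer_id"):
    """Count order rows stored on a different node than their join partner.

    The customers table is assumed to be distributed on customer_id, so an
    order can be joined locally only if it sits on node_for(customer_id).
    """
    return sum(
        1
        for order in orders
        if node_for(order[distribution_column]) != node_for(order[join_column])
    )


# Hypothetical data: 500 orders spread over 50 customers.
orders = [{"order_id": i, "customer_id": i % 50} for i in range(1, 501)]

colocated = rows_to_move(orders, distribution_column="customer_id")
scattered = rows_to_move(orders, distribution_column="order_id")

print("rows to move when distributed on customer_id:", colocated)
print("rows to move when distributed on order_id:", scattered)
```

When both tables share the distribution column, every join partner is already local and nothing has to cross the internal network. With a different column, some share of the rows must be shipped between nodes before the join can run.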