Databases Reference
In-Depth Information
• Requires no hand coding of programs to enable more processors
• Supports SMP, clustered, grid, and MPP platforms
In InfoSphere Streams, continuous applications are composed of individual
operators, which interconnect and operate on one or more data streams. Data
streams normally come from outside the system or can be produced internally as
part of an application. Operators can filter, classify, transform, correlate,
and/or fuse the data to support decisions based on business rules. Depending on
the need, the streams can be subdivided and processed by a large number of
nodes, thereby reducing latency and increasing processing volume.
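The operator model described above can be illustrated with a minimal sketch. It uses plain Python generators as stand-ins for operators and is not written in the InfoSphere Streams programming language. The operator names, the sensor tuples, and the 50-degree threshold are all hypothetical, chosen only to show how operators interconnect on a stream.

```python
from typing import Callable, Iterable, Iterator

Reading = dict  # one stream tuple, e.g. {"sensor": "a", "temp_f": 68.0}


def source(readings: Iterable[Reading]) -> Iterator[Reading]:
    """Source operator: emits tuples that arrive from outside the system."""
    for reading in readings:
        yield reading


def filter_op(stream: Iterable[Reading],
              predicate: Callable[[Reading], bool]) -> Iterator[Reading]:
    """Filter operator: passes only tuples that satisfy the predicate."""
    return (t for t in stream if predicate(t))


def transform_op(stream: Iterable[Reading],
                 fn: Callable[[Reading], Reading]) -> Iterator[Reading]:
    """Transform operator: rewrites each tuple."""
    return (fn(t) for t in stream)


def classify_op(stream: Iterable[Reading], threshold: float) -> Iterator[Reading]:
    """Classify operator: tags each tuple using a simple business rule."""
    for t in stream:
        yield {**t, "alert": t["temp_c"] > threshold}


def to_celsius(t: Reading) -> Reading:
    """Add a Celsius field computed from the Fahrenheit reading."""
    return {**t, "temp_c": round((t["temp_f"] - 32) * 5 / 9, 1)}


# Hypothetical input stream; the second tuple has no reading and is filtered out.
raw = [
    {"sensor": "a", "temp_f": 212.0},
    {"sensor": "b", "temp_f": None},
    {"sensor": "c", "temp_f": 68.0},
]

# Operators are chained so that each one consumes the previous one's stream.
pipeline = classify_op(
    transform_op(
        filter_op(source(raw), lambda t: t["temp_f"] is not None),
        to_celsius,
    ),
    threshold=50.0,
)
results = list(pipeline)
```

Because each operator only consumes and produces a stream, a real deployment could place the operators on different nodes or split a stream across several copies of the same operator, which is the subdivision the text refers to.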
The Netezza Performance Server (NPS®) system has a two-tiered architecture
designed to handle very large queries from multiple users.
The first tier is a high-performance Linux® symmetric multiprocessing host.
The host compiles queries received from Business Intelligence applications
and generates query execution plans. It then divides a query into a sequence of
subtasks, or snippets, which can be executed in parallel, and it distributes the
snippets to the second tier for execution. The host returns the final results to
the requesting application, thus offering the programming advantages of what
appears to be a traditional database server. The second tier consists of dozens,
hundreds, or thousands of Snippet Processing Units (SPUs) operating in parallel.
Each SPU is an intelligent query processing and storage node and consists
of a powerful commodity processor, dedicated memory, a disk drive, and a
field-programmable disk controller with hard-wired logic to manage data flows
and process queries at the disk level.
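The two-tier flow described above (compile on the host, run snippets on the SPUs, merge on the host) can be sketched as follows. This is an illustration only, not Netezza code: the thread pool stands in for the SPUs, and the table slices, function names, and the `min_amount` query are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list is the slice of an "orders" table
# stored on one SPU.
SPU_SLICES = [
    [{"id": 1, "amount": 40}, {"id": 2, "amount": 250}],
    [{"id": 3, "amount": 900}, {"id": 4, "amount": 10}],
    [{"id": 5, "amount": 120}],
]


def compile_query(min_amount):
    """Host side: turn the query into an ordered plan of snippets."""
    def restrict(rows):
        # Snippet 1: keep only the rows that match the predicate.
        return [row for row in rows if row["amount"] >= min_amount]

    def project(rows):
        # Snippet 2: keep only the column the query asked for.
        return [row["id"] for row in rows]

    return [restrict, project]


def run_on_spu(plan, local_rows):
    """SPU side: run every snippet, in order, against the local slice only."""
    data = local_rows
    for snippet in plan:
        data = snippet(data)
    return data


def run_query(min_amount):
    """Host side: distribute the plan to all SPUs and merge their results."""
    plan = compile_query(min_amount)
    with ThreadPoolExecutor(max_workers=len(SPU_SLICES)) as pool:
        partials = list(pool.map(lambda rows: run_on_spu(plan, rows), SPU_SLICES))
    return sorted(order_id for part in partials for order_id in part)


result = run_query(100)
```

The caller sees a single `run_query` call and a single result list, which mirrors how the host presents itself to the application as one database server.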
The massively parallel, shared-nothing SPU blades provide the performance
advantages of massively parallel processors. Nearly all query processing is done
at the SPU level, with each SPU operating on its portion of the database. All
operations that easily lend themselves to parallel processing (including record
operations, parsing, filtering, projecting, interlocking, and logging) are performed
by the SPU nodes, which significantly reduces the amount of data moved within
the system. Operations on sets of intermediate results, such as sorts, joins, and
aggregates, are executed primarily on the SPUs but can also be done on the host,
depending on the processing cost of that operation.
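To show why doing work at the SPU level reduces data movement, the sketch below computes a grouped sum in two stages: each SPU aggregates its own slice, and the host merges only the partial results. It is a simplified illustration under assumed data; the slice contents and function names are hypothetical and do not reflect Netezza internals.

```python
# Hypothetical slices of a (region, amount) table, one per SPU.
SPU_SLICES = {
    "spu0": [("east", 10), ("west", 5), ("east", 7)],
    "spu1": [("west", 3), ("east", 1)],
    "spu2": [("west", 8), ("west", 2), ("east", 4)],
}


def spu_partial_sum(rows):
    """SPU side: aggregate the local slice before anything leaves the node."""
    partial = {}
    for region, amount in rows:
        partial[region] = partial.get(region, 0) + amount
    return partial


def host_merge(partials):
    """Host side: combine the per-SPU partial sums into the final result."""
    total = {}
    for partial in partials:
        for region, subtotal in partial.items():
            total[region] = total.get(region, 0) + subtotal
    return total


partials = [spu_partial_sum(rows) for rows in SPU_SLICES.values()]
totals = host_merge(partials)

# Compare rows read on the SPUs with rows actually sent to the host.
rows_scanned = sum(len(rows) for rows in SPU_SLICES.values())
rows_moved = sum(len(partial) for partial in partials)
```

In this toy data set the saving is small, but each SPU sends at most one row per group regardless of how many rows its slice holds, so the gap between rows scanned and rows moved widens as the slices grow.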
A recent development in database scalability is IBM's pureScale offering.
Designed for organizations that run online transaction
processing (OLTP) applications on distributed systems, IBM® DB2® pureScale®
offers clustering technology that helps deliver high availability and exceptional