In the next sections, we will discuss the polymorphic data storage capabilities of
Greenplum, which help combine the best of the two worlds in a seamless manner.
Parallel versus distributed computing/processing
Parallel systems have been around for a while; the newer paradigm that has
gained traction in the Big Data world is distributed systems. In this section, let us
explore how parallel and distributed systems compare and contrast conceptually.
To understand parallel systems, we will use a simple classification, Flynn's taxonomy
(1966). Flynn classified parallel systems along two dimensions: instruction streams
and data streams. The following figure is a representation of Flynn's taxonomy:
• Single Instruction Single Data (SISD): This is the case of a single processor
with no parallelism in data or instructions. A single instruction is executed on
a single data item at a time, sequentially. An example is a uniprocessor machine.
• Multiple Instruction Single Data (MISD): Here, multiple instructions operate
on a single data stream; a typical use is fault tolerance, where redundant
units process the same data and their results are compared.
• Single Instruction Multiple Data (SIMD): This is a case of natural parallelism;
a single instruction triggers operation on multiple data streams.
• Multiple Instruction Multiple Data (MIMD): This is the case where multiple
independent instructions operate on multiple independent data streams.
Because there are multiple data streams, the memory can be either shared or
distributed. Distributed processing falls into this category. The previous figure
depicts MIMD and a variation in a distributed context.
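
The four categories above can be loosely illustrated in code. The following is only a conceptual sketch in Python, not a description of how any particular hardware or database engine works: the function names (sisd, simd_like, misd_like, mimd_like) are hypothetical, map merely expresses the "same operation on many data items" pattern rather than using hardware vector instructions, and threads in the standard Python interpreter do not guarantee true parallel execution.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def sisd(values):
    """SISD: one instruction stream handles one data item at a time."""
    result = []
    for v in values:
        result.append(v * v)
    return result


def simd_like(values):
    """SIMD-like: the same operation is applied across all data items.

    Assumption: map only models the pattern; real SIMD runs in hardware.
    """
    return list(map(lambda v: v * v, values))


def misd_like(value):
    """MISD-like: different instruction streams process the same data.

    Used here for fault tolerance: three independent implementations of
    squaring a non-negative integer, combined by majority vote.
    """
    implementations = [
        lambda v: v * v,
        lambda v: v ** 2,
        lambda v: sum(v for _ in range(v)),
    ]
    results = [impl(value) for impl in implementations]
    winner, _count = Counter(results).most_common(1)[0]
    return winner


def mimd_like():
    """MIMD-like: independent instructions on independent data streams."""
    tasks = [(sum, [1, 2, 3]), (max, [4, 9, 2]), (sorted, [3, 1, 2])]
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(fn, data) for fn, data in tasks]
        # Collect results in submission order.
        return [f.result() for f in futures]


if __name__ == "__main__":
    print(sisd([1, 2, 3]))
    print(simd_like([1, 2, 3]))
    print(misd_like(7))
    print(mimd_like())
```

In the MIMD-like case, each worker holds its own input list, which mirrors the distributed-memory variation; sharing one list among all workers would correspond to the shared-memory variation.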
Two critical requirements of parallel/distributed processing systems are high
availability and fault tolerance. There are several programming paradigms for
implementing parallelism. The following list details the important ones: