The context for all of these framework components is tightly coupled with the key characteristics of a big data application: algorithms that take advantage of running many tasks in parallel on many computing nodes to analyze large volumes of data distributed among many storage nodes. Typically, a big data platform consists of a collection (or pool) of processing nodes; optimal performance is achieved when all the processing nodes are kept busy, which means maintaining a healthy allocation of tasks to idle nodes within the pool. Any big data application that is to be developed must map to this context, and that is where the programming model comes in. The programming model essentially describes two aspects of application execution within a parallel environment:
1. How an application is coded
2. How that code maps to the parallel environment
The MapReduce programming model combines the familiar procedural/imperative approach used by Java or C++ programmers with what is effectively a functional programming model, such as the one used in languages like Lisp and APL. The similarity lies in MapReduce's reliance on two basic operations that are applied to sets or lists of key/value pairs:
1. Map, which describes the computation or analysis applied to a set of input key/value pairs to
produce a set of intermediate key/value pairs
2. Reduce, in which the values associated with each intermediate key output by the map operation are combined to produce the results (a minimal sketch of both operations follows this list)
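To make the two operations concrete, here is a minimal, self-contained word-count sketch in plain Java (not the Hadoop API); the class and method names are illustrative only. The map step emits intermediate (word, 1) pairs, the runtime's shuffle step is simulated by grouping the pairs by key, and the reduce step sums the values for each key.

import java.util.*;
import java.util.stream.*;

// Illustrative sketch of the MapReduce model: map emits intermediate
// (word, 1) pairs, the pairs are grouped by key, and reduce sums the
// counts per key.
public class WordCountSketch {

    // Map: one input record (a line of text) -> intermediate key/value pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Reduce: all values sharing one intermediate key -> a single result value.
    static int reduce(String key, List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<String> input = List.of("the map step emits pairs",
                                     "the reduce step combines pairs");

        // Shuffle/group: collect intermediate values by key, as a MapReduce
        // runtime would do between the map and reduce phases.
        Map<String, List<Integer>> grouped = input.stream()
                .flatMap(line -> map(line).stream())
                .collect(Collectors.groupingBy(Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));

        grouped.forEach((word, counts) ->
                System.out.println(word + " -> " + reduce(word, counts)));
    }
}

In a real framework the grouping and the distribution of map and reduce tasks are handled by the runtime; the programmer supplies only the two functions.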
A MapReduce application is envisioned as a series of basic operations applied in sequence to small sets drawn from very large collections (millions, billions, or even more) of data items. These data items are logically organized in a way that enables the MapReduce execution model to allocate tasks that can be executed in parallel.
Combining data and computational independence means that both the data and the computations can be distributed across multiple storage and processing units and automatically parallelized. This parallelizability allows the programmer to exploit scalable, massively parallel processing resources for increased processing speed and performance.
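The following hypothetical sketch illustrates that independence in plain Java: because each map call depends only on its own input record and each reduce depends only on the values for one key, the same word count can be parallelized simply by switching to a parallel stream, with the runtime spreading the independent tasks over the available cores, much as a MapReduce framework spreads them over the nodes of a cluster. The class name and sample data are assumptions for illustration.

import java.util.*;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.*;

// Data and computational independence: each record is mapped on its own,
// and each key is reduced on its own, so the work parallelizes automatically.
public class ParallelWordCount {
    public static void main(String[] args) {
        List<String> lines = Collections.nCopies(1_000,
                "independent records can be mapped and reduced in parallel");

        ConcurrentMap<String, Long> counts = lines.parallelStream()   // independent map tasks
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\W+")))
                .collect(Collectors.groupingByConcurrent(w -> w,      // shuffle by key
                        Collectors.counting()));                      // reduce per key

        System.out.println(counts);
    }
}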
14.10.4 SAP HANA
After the requisite background on big data and in-memory computing, we are now ready to get
acquainted with SAP HANA.
In 2006, SAP introduced the BW Accelerator (BWA), an appliance-based solution specifically targeted at improving the reporting and analytic capabilities of its SAP NetWeaver Business Warehouse (BW). The BWA solution is based on TREX (SAP's Search and Classification Engine) technology to support querying the large amounts of BW data for