16
Performance Tuning
WHAT'S IN THIS CHAPTER?
Understanding the factors that affect parallel scalable applications
Optimizing scalable processing, especially when it leverages the MapReduce model
Presenting a set of best practices for parallel processing
Illustrating a few Hadoop performance tuning tips
Today, much of the big data analysis in the world of NoSQL rests on the shoulders of the
MapReduce model of processing. Hadoop is built on it, and each NoSQL product that supports
huge data sizes leverages it. This chapter is a first look into optimizing scalable applications
and tuning the way MapReduce-style processing works on large data sets. By no means does
the chapter provide a prescriptive solution. Instead, it presents a few important concepts
and good practices to bear in mind when optimizing a scalable parallel application. Each
optimization problem is unique to its requirements and context, so a single universally
applicable solution is probably not feasible.
GOALS OF PARALLEL ALGORITHMS
MapReduce makes scalable parallel processing easier than it was in the past. By adhering
to a model in which data is not shared between parallel threads or processes, MapReduce
provides a bottleneck-free way of scaling out as workloads increase. The underlying goal at
all times is to reduce latency and increase throughput.
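To make the shared-nothing idea concrete, the following is a minimal word-count sketch written against the standard Hadoop MapReduce Java API; the class name WordCount and the command-line input and output paths are illustrative, not taken from this chapter. Each map task works on its own input split and each reduce task on its own partition of keys, so no mutable state crosses task boundaries.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Each map task processes its own input split; it shares no state
    // with any other map task running in parallel.
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);   // emit (word, 1)
            }
        }
    }

    // Each reduce task receives all values for its own partition of keys,
    // again without sharing state with other reduce tasks.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        // The combiner pre-aggregates map output locally, which reduces the
        // amount of data shuffled across the network; a common optimization.
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Because the tasks share nothing, Hadoop can scale this job out simply by running more map and reduce tasks on more nodes as the input grows, which is exactly the behavior the rest of this chapter aims to tune.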
The Implications of Reducing Latency
Reducing latency simply means reducing the execution time of a program. The faster a program
completes — that is, the less time a program takes to produce desired results for a given set of