Database Reference
Transformation and aggregation: Data can be transformed in a
number of ways on both the SQL Server and Hadoop platforms.
However, the Hadoop architecture enables you to distribute data
transformations over a cluster, which can dramatically speed up
transformation of large amounts of data.
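As a rough illustration of why distributing a transformation helps, the sketch below splits a small data set into partitions, aggregates each partition independently, and then merges the partial results. This is not Hadoop code: the threads only stand in for cluster nodes, and the sample records and the function names (partition, aggregate_partition, merge, distributed_totals) are hypothetical, chosen for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data: (region, sale_amount) records.
RECORDS = [
    ("east", 10), ("west", 5), ("east", 7),
    ("north", 3), ("west", 8), ("north", 1),
]


def partition(records, n_parts):
    """Split the records into n_parts slices, one per worker."""
    return [records[i::n_parts] for i in range(n_parts)]


def aggregate_partition(part):
    """Per-partition step: sum the amounts by region within one slice."""
    totals = Counter()
    for region, amount in part:
        totals[region] += amount
    return totals


def merge(partials):
    """Combine step: add the per-partition totals into one result."""
    combined = Counter()
    for partial in partials:
        combined.update(partial)  # Counter.update adds counts together
    return dict(combined)


def distributed_totals(records, n_workers=3):
    # Assumption: threads stand in for cluster nodes in this sketch.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(aggregate_partition,
                                 partition(records, n_workers)))
    return merge(partials)


if __name__ == "__main__":
    print(distributed_totals(RECORDS))
```

Because each partition is aggregated without reference to the others, the per-partition step can run on as many machines as there are partitions, and only the small partial totals need to be brought together at the end.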
The reverse is also true. You will find the need to extract data from Hadoop
and place it in SQL Server. Common scenarios for this include the following:
Business analysis and reporting: SQL Server has more options and
more robust end-user tools for doing data exploration, analysis, and
reporting. Moving the data into SQL Server enables the use of these
tools.
Integration: The results of your Hadoop analytics, transformations,
and aggregation may need to be integrated with other databases in your
organization.
Quality/consistency: SQL Server, as a relational database, offers
more capabilities to enforce data quality and consistency rules on the
data it stores. It does this by enforcing the rules when the data is added
to the databases, giving you confidence that the data already conforms
to your criteria when you query it.
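The quality/consistency point can be made concrete with a small sketch of rules being enforced at insert time. It uses Python's built-in sqlite3 module purely as a stand-in for a relational database, not SQL Server itself, although SQL Server supports comparable NOT NULL and CHECK constraints. The sales table, its columns, and the load_rows function are hypothetical.

```python
import sqlite3


def load_rows(rows):
    """Insert rows into a constrained table; return what was accepted and rejected."""
    conn = sqlite3.connect(":memory:")  # in-memory stand-in database
    # The rules live in the table definition, so every insert is checked.
    conn.execute(
        """
        CREATE TABLE sales (
            region TEXT NOT NULL,
            amount REAL NOT NULL CHECK (amount >= 0)
        )
        """
    )
    accepted, rejected = [], []
    for row in rows:
        try:
            conn.execute(
                "INSERT INTO sales (region, amount) VALUES (?, ?)", row
            )
            accepted.append(row)
        except sqlite3.IntegrityError:
            # The database refused the row because it violates a constraint.
            rejected.append(row)
    conn.commit()
    stored = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
    conn.close()
    return accepted, rejected, stored


if __name__ == "__main__":
    sample = [("east", 10.0), ("west", -5.0), (None, 3.0)]
    print(load_rows(sample))
```

Rows that break a rule never reach the table, so any later query sees only data that already conforms to the declared criteria.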
NOTE
SQL Server has more tools available today, but this is changing quickly.
More vendors are adding the ability to interact with Hadoop directly
into their tools, and the quality of the end-user experience is getting
better as the competition in this space increases.
Transferring Data Between Hadoop and SQL Server
One key consideration for moving data between Hadoop and SQL Server is
the time involved. With large data volumes, the data transfers often need
to be scheduled for times when other workloads on the relevant servers are
light.
Hadoop is optimized for batch data processing. Generally, when writing
data to Hadoop, you will see the best performance when you set up the
processing to handle large batches of data for import, instead of writing