After the data has been identified, you want to operationalize the solution.
This means building the extract, transform, and load (ETL) layer to move
data from Hadoop to your data warehouse. Here is where we will likely
deviate from your traditional processes of using a tool like SQL Server
Integration Services (SSIS) to do the ETL. The reason for this is that we have
already loaded the data onto an immensely powerful data processing engine
named Hadoop. Therefore, a recommended approach is to do the transform
directly on Hadoop and stage the data in Hadoop in the form needed to
move to your data warehouse solution.
This is where Pig proves to be incredibly powerful and the right tool for the
job. A Pig script that takes your semistructured data, moves it through a
data flow, and creates a structured staging table is an efficient way to
take advantage of your 8-, 32-, or 128-node Hadoop cluster. This
is certainly more efficient than pulling all of that data off of Hadoop into
a single SSIS server to transform the data for inserting into the data
warehouse. Once the data has been staged in Hadoop, you can use Sqoop to
move the data directly from Hadoop into the data warehouse tables where
the data is needed.
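
The final Sqoop step can be sketched as a small Python helper that assembles a `sqoop export` command line. This is only an illustration under stated assumptions: the JDBC URL, table name, user name, and HDFS staging directory are hypothetical, and the staged files are assumed to be tab-delimited (for example, written by a Pig script with PigStorage). The flags are standard `sqoop export` arguments, but check them against the Sqoop version you run.

```python
import shlex


def build_sqoop_export(jdbc_url, table, export_dir, username,
                       field_sep="\t", mappers=4):
    """Return the argv list for a `sqoop export` of a staged HDFS directory."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,                      # JDBC URL of the data warehouse
        "--username", username,
        "-P",                                       # prompt for the password on the console
        "--table", table,                           # target warehouse table
        "--export-dir", export_dir,                 # HDFS directory holding the staged data
        "--input-fields-terminated-by", field_sep,  # delimiter used by the staged files
        "--num-mappers", str(mappers),              # number of parallel export tasks
    ]


# All values below are hypothetical placeholders.
cmd = build_sqoop_export(
    jdbc_url="jdbc:sqlserver://dw-host:1433;databaseName=SalesDW",
    table="StagedClicks",
    export_dir="/data/staging/clicks",
    username="etl_user",
)

# Print a shell-safe version of the command for review or scheduling.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each argument intact if it is later passed to `subprocess.run`, and the printed form can be pasted into a scheduler once the placeholder values are replaced.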
Backups and High Availability in Your Big Data Environment
How well can your organization withstand the loss of your big data
environment? Do you need 100% uptime? What happens if a natural
disaster affects your data center, making it no longer viable? These are
questions every IT shop should be asking itself when building its big
data environment. The answers to these questions will influence your
high-availability and disaster recovery solutions.
High Availability
High availability is the approach taken to ensure that service-level
availability will be met during a certain period of time. If users cannot
access the system, it is unavailable. Thus, any high-availability approach
that you design for your big data solution should ensure that your users
can access the system at or above the service levels you have determined
are appropriate for that system.
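
To make the idea of a service level concrete, the following sketch works through the arithmetic in Python. It assumes availability is measured as the share of minutes in a period during which users could access the system; the 99.9% target and the 30-day period are illustrative values, not figures from the text.

```python
def availability_pct(total_minutes, downtime_minutes):
    """Percentage of the period during which the system was accessible."""
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def allowed_downtime_minutes(total_minutes, sla_pct):
    """Downtime budget, in minutes, implied by an availability target."""
    return total_minutes * (100.0 - sla_pct) / 100.0


def meets_sla(total_minutes, downtime_minutes, sla_pct):
    """True if the observed downtime stays within the downtime budget."""
    return downtime_minutes <= allowed_downtime_minutes(total_minutes, sla_pct)


# Illustrative values: a 30-day period and a 99.9% availability target.
MONTH_MINUTES = 30 * 24 * 60
SLA_TARGET = 99.9

budget = allowed_downtime_minutes(MONTH_MINUTES, SLA_TARGET)
print(f"Downtime budget: {budget:.1f} minutes")
print("30 minutes down meets target:", meets_sla(MONTH_MINUTES, 30, SLA_TARGET))
print("60 minutes down meets target:", meets_sla(MONTH_MINUTES, 60, SLA_TARGET))
```

Under these assumptions the monthly downtime budget comes to roughly 43 minutes, so a single hour-long outage would already miss the target.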