agreements with your user community will help determine what your
recovery strategy will be.
For example, if you incur a complete data center loss because of a natural
disaster, how important will it be to recover your Hadoop infrastructure? It
may be very low on your priority list, behind many other applications in your
organization. Applications such as the transactional systems that run your
business and create revenue will likely take priority. In addition, you might
have many other customer-facing applications that need to be up before you
need to have your big data solution up. You will likely also have applications
that your internal customers need to have available to service customers;
call centers and financial reporting come to mind. Your organization may
have other applications, such as your e-mail system, that need to be up and
running so that communication can happen. Next, you might have a data
warehouse and BI environment that needs to be available to drive internal
reporting of month-end processes. Finally, your big data solution may need
to be re-created within a few weeks to drive your analytical decisions.
If you need to be back up and running within a day, you need a different
disaster recovery plan. You can back up your Hortonworks Data Platform
(HDP) clusters to an external backup Hadoop cluster using DistCp. DistCp
stands for distributed copy and is a bulk data movement tool. By invoking
DistCp, you can periodically transfer HDFS data sets from the active
Hadoop cluster to the backup Hadoop cluster in another location.
To invoke DistCp, run the following command:
hadoop distcp hdfs://nn1:8020/foo/bar \
hdfs://nn2:8020/bar/foo
This command expands the namespace under /foo/bar on nn1 into a temporary
file, partitions its contents among a set of map tasks, and starts a copy on
each TaskTracker from nn1 to nn2.
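As a rough sketch of how the periodic transfer described earlier might be
scripted, the following shell fragment assembles the same hadoop distcp
command from a source and a destination URI. The function name
build_distcp_cmd and the DRY_RUN variable are hypothetical names chosen for
illustration and are not part of Hadoop; the hosts and paths are the ones
from the example. By default the script only prints the command, so it can be
reviewed before it is scheduled, for example from cron.

```shell
#!/bin/sh
# Sketch only: build_distcp_cmd and DRY_RUN are illustrative names,
# not part of Hadoop or HDP.

# Print the DistCp command line for a source and a destination URI.
build_distcp_cmd() {
    printf 'hadoop distcp %s %s\n' "$1" "$2"
}

SRC="hdfs://nn1:8020/foo/bar"   # active cluster (from the example)
DST="hdfs://nn2:8020/bar/foo"   # backup cluster (from the example)

CMD=$(build_distcp_cmd "$SRC" "$DST")

if [ "${DRY_RUN:-1}" = "1" ]; then
    # Default: show what would be run without touching either cluster.
    echo "$CMD"
else
    # Run the copy; a non-zero exit status signals a failed backup.
    hadoop distcp "$SRC" "$DST"
fi
```

Setting DRY_RUN=0 in the environment runs the copy instead of printing it.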
You must use absolute paths with DistCp. Each TaskTracker must be able
to communicate with both the source and destination systems. It is
recommended that you run the same protocol version on both systems. The
source system should be quiesced before you invoke DistCp. If clients are
writing to the source system while DistCp is running, the copy may
fail.
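The absolute-path requirement can be checked before DistCp is invoked. The
following is a minimal sketch, assuming you want to accept only fully
qualified hdfs:// URIs that name a NameNode and an absolute path, which is
stricter than the requirement stated above. The helper is_absolute_hdfs_uri
is a hypothetical name, not a Hadoop command.

```shell
#!/bin/sh
# Sketch only: is_absolute_hdfs_uri is an illustrative helper,
# not a Hadoop command.

# Succeed if the argument looks like hdfs://<namenode>/<absolute path>.
is_absolute_hdfs_uri() {
    case "$1" in
        hdfs://?*/*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Check one fully qualified URI and one relative path.
for uri in "hdfs://nn1:8020/foo/bar" "foo/bar"; do
    if is_absolute_hdfs_uri "$uri"; then
        echo "ok: $uri"
    else
        echo "rejected (not an absolute hdfs:// URI): $uri"
    fi
done
```

A check like this catches only malformed arguments. It does not confirm that
either cluster is reachable or that the source has been quiesced.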
Table 16.1 describes some of the important options for DistCp.
 