Table 16.1 Import Options for DistCp

Flag             Description
-p[rbugp]        Preserves r: Replication number, b: Block size, u: User,
                 g: Group, p: Permission
-i               Ignores failures
-log <logdir>    Outputs logs to <logdir>
-m <num_maps>    Maximum number of maps for copying of data
-update          If source size different from destination size, it will get
                 overwritten. The default is not to overwrite as long as the
                 file exists.
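
To make the flags in Table 16.1 concrete, here is a minimal sketch of how they might be combined into a single DistCp invocation. The helper name build_distcp_command, the cluster URIs, and the log directory are hypothetical and used only for illustration; the sketch assembles the argument list but does not run the job.

```python
from typing import List, Optional


def build_distcp_command(
    source: str,
    destination: str,
    preserve: str = "",
    ignore_failures: bool = False,
    log_dir: Optional[str] = None,
    max_maps: Optional[int] = None,
    update: bool = False,
) -> List[str]:
    """Assemble an argument list for `hadoop distcp` using the Table 16.1 flags.

    This is an illustrative helper, not part of Hadoop itself.
    """
    # -p accepts any combination of the letters r, b, u, g, p.
    if not set(preserve) <= set("rbugp"):
        raise ValueError(f"preserve may only contain letters from 'rbugp': {preserve!r}")
    if max_maps is not None and max_maps < 1:
        raise ValueError("max_maps must be a positive integer")

    command = ["hadoop", "distcp"]
    if preserve:
        command.append("-p" + preserve)      # e.g. -pugp keeps user, group, permission
    if ignore_failures:
        command.append("-i")                 # ignore failures
    if log_dir is not None:
        command += ["-log", log_dir]         # write logs to <logdir>
    if max_maps is not None:
        command += ["-m", str(max_maps)]     # cap the number of maps
    if update:
        command.append("-update")            # overwrite when sizes differ
    command += [source, destination]
    return command


# Hypothetical source and destination URIs, chosen only for the example.
example = build_distcp_command(
    "hdfs://nn1:8020/data/raw",
    "hdfs://nn2:8020/data/raw",
    preserve="ugp",
    log_dir="/tmp/distcp-logs",
    max_maps=20,
    update=True,
)
print(" ".join(example))
# hadoop distcp -pugp -log /tmp/distcp-logs -m 20 -update hdfs://nn1:8020/data/raw hdfs://nn2:8020/data/raw
```

Returning a list rather than a single string means the result could be handed to subprocess.run without shell quoting concerns, should you choose to execute it.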
Big Data Solution Governance
As your big data environment develops, you must mature your ability to
govern the solution. Preferably, you will have a team dedicated to governing
the environment. What does governance mean in this instance? These team
members are the gatekeepers of Hadoop. Specifically, this team manages
the rules of engagement for your Hadoop clusters. The kinds of questions
that they will answer include the following:
• What are the use cases that go onto the cluster?
• How do users apply for access to the cluster?
• Who owns each set of data?
• What are the user quotas?
• What are our regulatory requirements around information life cycle
management?
• What are our security and privacy guidelines?
• How do external systems interact with the system?
• What should be audited and logged?
If there are new questions about the rules of engagement for the cluster,
the big data governance team will answer those questions and set up the
 