spans a wide geographical area over multiple data centers, then a replica in the local data
center is preferred over a remote replica.
FIGURE 9.6 Data replication and replica management
NameNode (filename, replicas, block-ids):
/home/user/data/part=0000, r:3, {2, 3, 4}
/home/user/data/part=0001, r:4, {1}
[Diagram: blocks 1 to 4 replicated over the network across 2 DataNodes in rack 1 and 3 DataNodes in rack 2; block 1 appears four times and blocks 2, 3, and 4 three times each.]
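The per-file replication factors shown in the figure (r:3 and r:4) can be inspected and changed from the command line. The following is a minimal sketch rather than part of the original text: the `DRY_RUN` variable and the `run` helper are illustrative assumptions, and the file path is the one from the figure's NameNode listing. By default the script only prints the commands, so it can be read without a running cluster.

```shell
#!/bin/sh
# Sketch: inspect and adjust the replication of one HDFS file.
# DRY_RUN and run() are illustrative helpers, not part of Hadoop.
# By default commands are only printed; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"      # show the command without executing it
    else
        "$@"             # execute the command as given
    fi
}

FILE="/home/user/data/part=0000"   # path from the figure's NameNode listing

# Print the replication factor recorded for the file (%r).
run hadoop fs -stat %r "$FILE"

# List the file's blocks and the DataNodes that hold each replica.
run hadoop fsck "$FILE" -files -blocks -locations

# Request a replication factor of 4; the extra copies are made in the background.
run hadoop fs -setrep 4 "$FILE"
```

On a single-node setup only one DataNode exists, so a factor above 1 cannot actually be reached there. On Hadoop 2 and later the block report is also available as `hdfs fsck`.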
EXERCISE 9.1
Adding, Removing, and Reading Data from HDFS
Before beginning this exercise, please make sure that Java and Hadoop are properly
installed and configured on your system (follow the instructions at
http://hadoop.apache.org/). A local single-node setup is sufficient for this exercise.
To add files to HDFS, create a target directory and copy the local files into it:
$ hadoop dfs -mkdir /user/myuser/docs
$ hadoop dfs -put /home/myuser/docs/* /user/myuser/docs/
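After the upload it can help to confirm that the files arrived and can be read back. The sketch below is one possible way to do that, not a step from the original exercise: the `DRY_RUN` variable and `run` helper are illustrative, the file name `resume_old.txt` is borrowed from the removal step of this exercise, and `/tmp/resume_old.txt` is an arbitrary local destination. It uses `hadoop fs`, the generic form of the file system shell.

```shell
#!/bin/sh
# Sketch: verify an upload by listing the directory and reading a file back.
# DRY_RUN and run() are illustrative helpers, not part of Hadoop.
# Commands are printed by default; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

DOCS="/user/myuser/docs"   # HDFS directory created in the exercise

# List the uploaded files with their sizes and replication factors.
run hadoop fs -ls "$DOCS"

# Print the contents of one file to standard output.
run hadoop fs -cat "$DOCS/resume_old.txt"

# Copy the file from HDFS back to the local file system
# (the local destination path is an arbitrary choice).
run hadoop fs -get "$DOCS/resume_old.txt" /tmp/resume_old.txt
```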
To remove a file:
$ hadoop dfs -rm /user/myuser/docs/resume_old.txt
To remove the directory:
$ hadoop dfs -rmr /user/myuser/docs/
Removed files are kept in each user's .Trash directory, provided the trash feature is
enabled (it is off when fs.trash.interval is left at its default of 0). To delete a file
permanently without sending it to the .Trash directory, run the following command:
$ hadoop dfs -rm -skipTrash /user/myuser/docs/resume_old.txt
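The commands in this exercise use the older `hadoop dfs` entry point and the `-rmr` option. Hadoop 2 and later still accept both but mark them as deprecated in favor of `hdfs dfs` and `-rm -r`. The sketch below shows the mapping for the commands used above; `modernize` is a hypothetical helper written for illustration that only rewrites text and never contacts HDFS.

```shell
#!/bin/sh
# Sketch: rewrite the exercise's legacy commands into the forms that
# Hadoop 2 and later recommend. modernize() is a hypothetical helper;
# it edits the command text only and does not run anything against HDFS.
modernize() {
    printf '%s\n' "$1" |
        sed -e 's/^hadoop dfs /hdfs dfs /' \
            -e 's/ -rmr / -rm -r /'
}

# Print the current equivalent of each command from the exercise.
modernize 'hadoop dfs -mkdir /user/myuser/docs'
modernize 'hadoop dfs -rm /user/myuser/docs/resume_old.txt'
modernize 'hadoop dfs -rmr /user/myuser/docs/'
modernize 'hadoop dfs -rm -skipTrash /user/myuser/docs/resume_old.txt'
```

The third command, for example, becomes `hdfs dfs -rm -r /user/myuser/docs/`. On Hadoop 2 and later, `-mkdir` also needs the `-p` option when the parent directories do not exist yet.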