Storage Provisioning and Networking - Deploying and Managing a Cloud Infrastructure - page 270

spans a wide geographical area over multiple data centers, then a replica in the local data center is preferred over a remote replica.
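
The preference for a nearby replica can be illustrated with a small sketch. It assumes, purely for illustration, that every node is named by a topology path of the form /datacenter/rack/node and that "closeness" is the number of hops to the nearest common ancestor in that tree. The function names topo_distance and closest_replica and all paths are hypothetical; this is not the HDFS implementation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: tree distance between two topology paths of the
# assumed form /datacenter/rack/node (hops up to the closest common
# ancestor from each side).
topo_distance() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    na = split(substr(a, 2), pa, "/")   # drop the leading "/" then split
    nb = split(substr(b, 2), pb, "/")
    c = 0                               # number of shared leading components
    while (c < na && c < nb && pa[c + 1] == pb[c + 1]) c++
    d = (na - c) + (nb - c)
    print d
  }'
}

# Print the replica location closest to the reader; the first one wins a tie.
closest_replica() {
  local reader=$1 best="" best_d=-1 d r
  shift
  for r in "$@"; do
    d=$(topo_distance "$reader" "$r")
    if [ "$best_d" -lt 0 ] || [ "$d" -lt "$best_d" ]; then
      best=$r
      best_d=$d
    fi
  done
  echo "$best"
}

# A reader in dc1 chooses between a remote and a local data center replica.
closest_replica /dc1/rack1/node1 /dc2/rack5/node9 /dc1/rack2/node4
# prints /dc1/rack2/node4
```

Under these assumptions the remote replica is 6 hops away and the replica in the local data center is 4 hops away, so the local one is chosen.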

FIGURE 9.6 Data replication and replica management

NameNode (filename, replicas, block-ids):

/home/user/data/part=0000, r:3, {2, 3, 4}

/home/user/data/part=0001, r:4, {1}

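[Figure: block replicas 1 through 4 are replicated over the network across 2 DataNodes in rack 1 and 3 DataNodes in rack 2. Block 1 is stored four times; blocks 2, 3, and 4 are stored three times each.]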
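
To make the NameNode metadata in Figure 9.6 concrete, the following sketch works out how many block replicas the DataNodes must hold in total. It assumes the metadata is available as text lines in exactly the form shown in the figure (filename, r:replicas, {block-ids}); the function name replica_totals is hypothetical and is not part of Hadoop.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read metadata lines of the form
#   filename, r:<replicas>, {block-ids}
# from standard input and report how many block replicas each file needs.
replica_totals() {
  local line name r blocks n total=0
  while IFS= read -r line; do
    [ -z "$line" ] && continue
    name=${line%%,*}            # text before the first comma
    r=${line#*r:}               # drop everything up to and including "r:"
    r=${r%%,*}                  # keep the replication factor only
    blocks=${line#*\{}          # drop everything up to the opening brace
    blocks=${blocks%\}*}        # drop the closing brace
    # Count the block ids, one per line after splitting on commas.
    n=$(echo "$blocks" | tr ',' '\n' | grep -c '[0-9]')
    echo "$name: $n block(s) x $r replicas = $((n * r))"
    total=$((total + n * r))
  done
  echo "total block replicas: $total"
}

replica_totals <<'EOF'
/home/user/data/part=0000, r:3, {2, 3, 4}
/home/user/data/part=0001, r:4, {1}
EOF
```

For the two files in the figure this reports 9 replicas for the first file, 4 for the second, and 13 in total.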

EXERCISE 9.1

Adding, Removing, and Reading Data from HDFS

Before beginning this exercise, please make sure that Java and Hadoop are properly installed and configured on your system (follow the instructions at http://hadoop.apache.org/). A local single-node setup is sufficient for this exercise.

To create a directory in HDFS and copy local files into it, run the following commands:

$ hadoop dfs -mkdir /user/myuser/docs

$ hadoop dfs -put /home/myuser/docs/* /user/myuser/docs/

To remove a file:

$ hadoop dfs -rm /user/myuser/docs/resume_old.txt

To remove the directory and everything in it:

$ hadoop dfs -rmr /user/myuser/docs/

When the HDFS trash feature is enabled, removed files are moved to each user's .Trash directory. To delete a file permanently without sending it to the .Trash directory, run the following command:

$ hadoop dfs -rm -skipTrash /user/myuser/docs/resume_old.txt
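
Because a delete with -skipTrash cannot be undone from the trash, it may help to review the commands before running them. The sketch below wraps the exercise's commands in a dry-run mode. The wrapper hdfs_cmd, the DRY_RUN variable, and the function names are assumptions made for illustration; only the hadoop dfs commands and paths themselves come from the exercise.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: with DRY_RUN=1 the command is only printed,
# otherwise it is passed to "hadoop dfs" unchanged.
hdfs_cmd() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "hadoop dfs $*"
  else
    hadoop dfs "$@"
  fi
}

# Create the target directory and copy the local documents into it.
upload_docs() {
  hdfs_cmd -mkdir /user/myuser/docs
  hdfs_cmd -put /home/myuser/docs/* /user/myuser/docs/
}

# Delete one file permanently, bypassing the per-user .Trash directory.
remove_old_resume() {
  hdfs_cmd -rm -skipTrash /user/myuser/docs/resume_old.txt
}

# Preview the permanent delete without touching HDFS.
DRY_RUN=1 remove_old_resume
# prints: hadoop dfs -rm -skipTrash /user/myuser/docs/resume_old.txt
```

Running the same function without DRY_RUN set executes the command against the cluster.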
