The following commands will work on an HDInsight cluster using the
standard HDFS implementation as well as ASV (discussed in Chapter
13, “Big Data and the Cloud”). However, you need to adjust the paths
for each case. To reference a path in the locally attached distributed file
system, use hdfs://<namenodehost>/<path> as the path. To
reference a path in ASV, use
<path> as the path. You can change the asv prefix to asvs to use an
encrypted connection.
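As a minimal sketch of the path adjustment described above, the following shell helpers build a fully qualified HDFS path and switch an ASV path to the encrypted prefix. The function names hdfs_uri and to_encrypted and the NAMENODE_HOST variable are hypothetical, introduced only for illustration. The listing at the end runs only if the hadoop command is installed and NAMENODE_HOST is set.

```shell
#!/bin/sh
# Hypothetical helper: build hdfs://<namenodehost>/<path>.
# $1 = NameNode host, $2 = absolute path such as /user
hdfs_uri() {
    printf 'hdfs://%s%s\n' "$1" "$2"
}

# Hypothetical helper: change the asv prefix to asvs (encrypted connection).
# Anything that does not start with "asv:" is returned unchanged.
to_encrypted() {
    case "$1" in
        asv:*) printf 'asvs:%s\n' "${1#asv:}" ;;
        *)     printf '%s\n' "$1" ;;
    esac
}

# List /user through its full URI. NAMENODE_HOST is an assumed variable
# that you would set to your own cluster's NameNode host.
if [ -n "${NAMENODE_HOST:-}" ] && command -v hadoop >/dev/null 2>&1; then
    hadoop dfs -ls "$(hdfs_uri "$NAMENODE_HOST" /user)"
fi
```

Keeping the prefix handling in one place means the same commands can be pointed at either storage system by changing only the path argument.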
By default, HDInsight creates the directories listed in Table 5.1 during the
initial setup.
Table 5.1 Initial HDFS Root Directories
Directory used by Hive for data storage (see Chapter 6, “Adding Structure with Hive”)
Directory used for MapReduce
Directory for user data
You can list the root directories by using the ls or lsr command:
hadoop dfs -ls /
hadoop dfs -lsr /
ls lists the contents of the specified folder. In the example, /
indicates the root folder. lsr also lists folder contents, but it
recurses into each subfolder it encounters.
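The difference between a flat and a recursive listing can be illustrated with a local-filesystem analogy. This sketch uses the ordinary ls and ls -R commands on a throwaway folder, not the hadoop commands themselves, and the folder and file names (data, raw, sample.txt) are hypothetical.

```shell
#!/bin/sh
# Build a small throwaway folder tree on the local filesystem.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/data/raw"
touch "$demo_root/data/raw/sample.txt"

# Flat listing: only the immediate children of the folder (analogous to -ls).
flat_listing=$(ls "$demo_root")

# Recursive listing: descends into every subfolder (analogous to -lsr).
recursive_listing=$(ls -R "$demo_root")

echo "Flat listing:"
echo "$flat_listing"
echo "Recursive listing:"
echo "$recursive_listing"

# Remove the throwaway tree.
rm -rf "$demo_root"
```

The flat listing shows only the top-level folder, while the recursive listing also reaches the file nested two levels down.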
Normally, user files are created in a subfolder of the /user folder, and
the subfolder is named after the user. However, this is not a
requirement, and you can tailor the folder structure to fit specific scenarios.
The following examples use a fictional user named MSBigDataSolutions.