SETTING USER IDENTITY
The user identity that Hadoop uses for permissions in HDFS is determined by running the whoami command
on the client system. Similarly, the group names are derived from the output of running groups.
If, however, your Hadoop user identity is different from the name of your user account on your client
machine, you can explicitly set your Hadoop username by setting the HADOOP_USER_NAME environment
variable. You can also override user group mappings by means of the
hadoop.user.group.static.mapping.overrides configuration property. For example,
dr.who=;preston=directors,inventors means that the dr.who user is in no groups, but
preston is in the directors and inventors groups.
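The two rules above can be sketched in shell. This is only an illustration, assuming a simple (non-Kerberos) setup: effective_user mirrors the "override, else whoami" rule, and groups_for is a hypothetical helper (not part of Hadoop) that reads a mapping string in the user=group1,group2;user2= format shown in the example.

```shell
# Assumption: simple authentication, so HADOOP_USER_NAME is honored.
export HADOOP_USER_NAME=preston   # example username taken from the text

# Identity Hadoop would use: the override if set, else the OS account name.
effective_user="${HADOOP_USER_NAME:-$(whoami)}"
echo "Hadoop user: $effective_user"

# Hypothetical helper: print the groups assigned to a user by a static
# mapping string such as 'dr.who=;preston=directors,inventors'.
# Returns non-zero if the user has no entry in the mapping.
groups_for() {
  user="$1"
  mapping="$2"
  old_ifs="$IFS"
  IFS=';'                          # entries are separated by semicolons
  for entry in $mapping; do
    name="${entry%%=*}"            # text before the first '='
    groups="${entry#*=}"           # text after the first '='
    if [ "$name" = "$user" ]; then
      IFS="$old_ifs"
      printf '%s\n' "$groups" | tr ',' ' '
      return 0
    fi
  done
  IFS="$old_ifs"
  return 1
}

groups_for preston 'dr.who=;preston=directors,inventors'
```

With HADOOP_USER_NAME exported this way, later hadoop commands in the same shell session would run as that user; the helper only illustrates the mapping format and does not change what Hadoop itself resolves.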
You can set the user identity that the Hadoop web interfaces run as by setting the
hadoop.http.staticuser.user property. By default, it is dr.who, which is not a superuser, so
system files are not accessible through the web interface.
Notice that, by default, there is no authentication with this system. See Security for how to use Kerberos
authentication with Hadoop.
With this setup, it is easy to use any configuration with the -conf command-line switch.
For example, the following command shows a directory listing on the HDFS server running
in pseudodistributed mode on localhost:
% hadoop fs -conf conf/hadoop-localhost.xml -ls .
Found 2 items
drwxr-xr-x - tom supergroup 0 2014-09-08 10:19 input
drwxr-xr-x - tom supergroup 0 2014-09-08 10:19 output
If you omit the -conf option, you pick up the Hadoop configuration in the etc/hadoop
subdirectory under $HADOOP_HOME. Or, if HADOOP_CONF_DIR is set, Hadoop configuration
files will be read from that location.
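The lookup rule just described can be written as a small shell function. This is a sketch of the rule as stated in the text, not Hadoop's own startup logic; resolve_conf_dir is a hypothetical name introduced here for illustration.

```shell
# Hypothetical helper: print the directory a hadoop command would read its
# configuration from when -conf is not given, per the rule in the text.
resolve_conf_dir() {
  if [ -n "${HADOOP_CONF_DIR:-}" ]; then
    # An explicitly set HADOOP_CONF_DIR is used as the configuration location.
    printf '%s\n' "$HADOOP_CONF_DIR"
  else
    # Otherwise fall back to etc/hadoop under the installation directory.
    printf '%s\n' "${HADOOP_HOME:-}/etc/hadoop"
  fi
}

resolve_conf_dir
```

Running it before a hadoop command is a quick way to check which set of files you are about to pick up.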
NOTE
Here's an alternative way of managing configuration settings. Copy the etc/hadoop directory from your
Hadoop installation to another location, place the *-site.xml configuration files there (with appropriate
settings), and set the HADOOP_CONF_DIR environment variable to the alternative location. The main
advantage of this approach is that you don't need to specify -conf for every command. It also allows
you to isolate changes to files other than the Hadoop XML configuration files (e.g., log4j.properties)
since the HADOOP_CONF_DIR directory has a copy of all the configuration files (see Hadoop
Configuration).
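The steps in this note might look like the following in shell. It is a minimal sketch: use_conf_copy is a hypothetical helper, and the paths in the usage comment are placeholders rather than locations Hadoop requires.

```shell
# Hypothetical helper: copy a Hadoop configuration directory to another
# location and point HADOOP_CONF_DIR at the copy for this shell session.
use_conf_copy() {
  src="$1"    # e.g. "$HADOOP_HOME/etc/hadoop"
  dest="$2"   # alternative location that will hold the copy
  [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
  mkdir -p "$dest" || return 1
  # Copy every file, including non-XML ones such as log4j.properties.
  cp -R "$src/." "$dest/" || return 1
  export HADOOP_CONF_DIR="$dest"
}

# Example usage (placeholder destination path):
# use_conf_copy "$HADOOP_HOME/etc/hadoop" "$HOME/hadoop-conf-localhost"
```

After the copy, you would edit the *-site.xml files in the new directory; commands run in the same session then read them without needing -conf.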