write) operations are sent first to the namenode, which sends an HTTP redirect to the client indicating the datanode to stream file data from (or to).
The second way of accessing HDFS over HTTP relies on one or more standalone proxy
servers. (The proxies are stateless, so they can run behind a standard load balancer.) All
traffic to the cluster passes through the proxy, so the client never accesses the namenode
or datanode directly. This allows for stricter firewall and bandwidth-limiting policies to be
put in place. It's common to use a proxy for transfers between Hadoop clusters located in
different data centers, or when accessing a Hadoop cluster running in the cloud from an
external network.
The HttpFS proxy exposes the same HTTP (and HTTPS) interface as WebHDFS, so clients can access both using webhdfs (or swebhdfs) URIs. The HttpFS proxy is started independently of the namenode and datanode daemons, using the httpfs.sh script, and by default listens on a different port number (14000).
C
Hadoop provides a C library called libhdfs that mirrors the Java FileSystem interface
(it was written as a C library for accessing HDFS, but despite its name it can be used to
access any Hadoop filesystem). It works using the Java Native Interface (JNI) to call a
Java filesystem client. There is also a libwebhdfs library that uses the WebHDFS interface
described in the previous section.
The C API is very similar to the Java one, but it typically lags behind, so some newer features may not be supported. You can find the header file, hdfs.h, in the include directory of the Apache Hadoop binary tarball distribution.
The Apache Hadoop binary tarball comes with prebuilt libhdfs binaries for 64-bit Linux,
but for other platforms you will need to build them yourself by following the
BUILDING.txt instructions at the top level of the source tree.
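A minimal read using the libhdfs API declared in hdfs.h might look like the sketch below. Compiling and running it requires a Hadoop installation (the library calls into a JVM via JNI), so treat it as an illustrative sketch rather than a standalone program; the file path is a placeholder:

```c
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include "hdfs.h"

int main(void) {
    /* Passing "default" picks up the filesystem named in the client
     * configuration (fs.defaultFS), analogous to the Java
     * FileSystem.get() default. */
    hdfsFS fs = hdfsConnect("default", 0);
    if (fs == NULL) {
        fprintf(stderr, "failed to connect to HDFS\n");
        return EXIT_FAILURE;
    }

    /* Open for reading; buffer size, replication, and block size of 0
     * mean "use the defaults". */
    hdfsFile in = hdfsOpenFile(fs, "/user/tom/data.txt", O_RDONLY, 0, 0, 0);
    if (in == NULL) {
        fprintf(stderr, "failed to open file\n");
        hdfsDisconnect(fs);
        return EXIT_FAILURE;
    }

    char buf[4096];
    tSize n = hdfsRead(fs, in, buf, sizeof(buf));
    printf("read %d bytes\n", (int) n);

    hdfsCloseFile(fs, in);
    hdfsDisconnect(fs);
    return EXIT_SUCCESS;
}
```

The program is compiled against the headers and libraries from the Hadoop distribution (linking with -lhdfs and the JVM library), per the BUILDING.txt instructions mentioned above.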
NFS
It is possible to mount HDFS on a local client's filesystem using Hadoop's NFSv3 gateway. You can then use Unix utilities (such as ls and cat) to interact with the filesystem, upload files, and in general use POSIX libraries to access the filesystem from any programming language. Appending to a file works, but random modifications of a file do not, since HDFS can only write to the end of a file.
Consult the Hadoop documentation for how to configure and run the NFS gateway and
connect to it from a client.