    try {
      in = new URL(args[0]).openStream();
      IOUtils.copyBytes(in, System.out, 4096, false);
    } finally {
      IOUtils.closeStream(in);
    }
  }
}
We make use of the handy IOUtils class that comes with Hadoop for closing the stream in the finally clause, and also for copying bytes between the input stream and the output stream (System.out, in this case). The last two arguments to the copyBytes() method are the buffer size used for copying and whether to close the streams when the copy is complete. We close the input stream ourselves, and System.out doesn't need to be closed.
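To make the meaning of those last two arguments concrete, here is a simplified, standard-library-only sketch of what a copy routine like IOUtils.copyBytes() does; the class name CopyBytesSketch and this implementation are illustrative, not Hadoop's actual code:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class CopyBytesSketch {
  // Simplified analogue of IOUtils.copyBytes(in, out, buffSize, close):
  // read into a fixed-size buffer until EOF, optionally closing both streams.
  public static void copyBytes(InputStream in, OutputStream out,
                               int buffSize, boolean close) throws IOException {
    try {
      byte[] buf = new byte[buffSize];
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
      }
    } finally {
      if (close) {
        in.close();
        out.close();
      }
    }
  }

  public static void main(String[] args) throws IOException {
    InputStream in =
        new ByteArrayInputStream("On the top of the Crumpetty Tree".getBytes("UTF-8"));
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // false: the caller is responsible for closing the streams afterward,
    // just as in the URLCat example above.
    copyBytes(in, out, 4096, false);
    System.out.println(out.toString("UTF-8"));
  }
}
```

Passing false for the close flag matches the pattern in the example above: we close the input stream in our own finally clause, and leave System.out open.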
% export HADOOP_CLASSPATH=hadoop-examples.jar
% hadoop URLCat hdfs://localhost/user/tom/quangle.txt
On the top of the Crumpetty Tree
The Quangle Wangle sat,
But his face you could not see,
On account of his Beaver Hat.
Reading Data Using the FileSystem API
As the previous section explained, sometimes it is impossible to set a URLStreamHandlerFactory for your application. In this case, you will need to use the FileSystem API to open an input stream for a file.
A file in a Hadoop filesystem is represented by a Hadoop Path object (and not a java.io.File object, since its semantics are too closely tied to the local filesystem). You can think of a Path as a Hadoop filesystem URI, such as hdfs://localhost/user/tom/quangle.txt.
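Since a Path wraps a filesystem URI, its pieces line up with what java.net.URI reports; this small standard-library-only illustration pulls apart the example URI above (the PathAsUri class name is just for this sketch):

```java
import java.net.URI;

public class PathAsUri {
  public static void main(String[] args) {
    // A Hadoop Path is essentially a filesystem URI like this one.
    URI uri = URI.create("hdfs://localhost/user/tom/quangle.txt");
    System.out.println(uri.getScheme());    // hdfs (the filesystem scheme)
    System.out.println(uri.getAuthority()); // localhost (the namenode host)
    System.out.println(uri.getPath());      // /user/tom/quangle.txt
  }
}
```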
FileSystem is a general filesystem API, so the first step is to retrieve an instance for the filesystem we want to use (HDFS, in this case). There are several static factory methods for getting a FileSystem instance:
public static FileSystem get(Configuration conf) throws IOException
public static FileSystem get(URI uri, Configuration conf) throws IOException
public static FileSystem get(URI uri, Configuration conf, String