The readFully() methods read length bytes into the buffer (or buffer.length bytes for the version that takes only a byte array buffer), unless the end of the file is reached, in which case an EOFException is thrown.
All of these methods preserve the current offset in the file and are thread safe (although FSDataInputStream is not designed for concurrent access; therefore, it's better to create multiple instances), so they provide a convenient way to access another part of the file (metadata, perhaps) while reading the main body of the file.
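
The contract described above (fill the buffer from a given position, throw EOFException if the file ends first, and leave the stream's current offset alone) can be sketched without a Hadoop installation. The example below is only an analogy: it uses java.nio's FileChannel as a stand-in for FSDataInputStream, and the class PositionedReadSketch, its readFully() helper, and the sample file contents are all hypothetical names invented for illustration.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class PositionedReadSketch {

    // Hypothetical helper (not a Hadoop method): fills the whole buffer starting
    // at the given file position, or throws EOFException if the file ends first.
    // FileChannel.read(dst, position) does not move the channel's own position.
    static void readFully(FileChannel channel, long position, byte[] buffer) throws IOException {
        ByteBuffer dst = ByteBuffer.wrap(buffer);
        while (dst.hasRemaining()) {
            int n = channel.read(dst, position + dst.position());
            if (n < 0) {
                throw new EOFException("End of file reached before the buffer was filled");
            }
        }
    }

    // Writes a small temporary file, reads its "header" while positioned in the
    // "body", then attempts a read that runs past the end of the file.
    static String demo() {
        try {
            Path file = Files.createTempFile("positioned-read", ".txt");
            try (FileChannel channel = openWithSampleData(file)) {
                channel.position(7);               // pretend we are streaming the body
                byte[] header = new byte[6];
                readFully(channel, 0, header);     // positioned read of the first 6 bytes
                String result = new String(header, StandardCharsets.US_ASCII)
                        + "|" + channel.position(); // offset is still 7
                try {
                    readFully(channel, 20, new byte[100]); // only 3 bytes remain
                } catch (EOFException e) {
                    result += "|EOF";
                }
                return result;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static FileChannel openWithSampleData(Path file) throws IOException {
        Files.write(file, "HEADER|body of the file".getBytes(StandardCharsets.US_ASCII));
        return FileChannel.open(file, StandardOpenOption.READ);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The sketch reports the header bytes, the unchanged offset, and the end-of-file case in one string. With FSDataInputStream itself, the corresponding calls would be the positioned readFully() methods discussed above.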
Finally, bear in mind that calling seek() is a relatively expensive operation and should be done sparingly. You should structure your application access patterns to rely on streaming data (by using MapReduce, for example) rather than performing a large number of seeks.
Writing Data
The FileSystem class has a number of methods for creating a file. The simplest is the method that takes a Path object for the file to be created and returns an output stream to write to:
    public FSDataOutputStream create(Path f) throws IOException
There are overloaded versions of this method that allow you to specify whether to forcibly overwrite existing files, the replication factor of the file, the buffer size to use when writing the file, the block size for the file, and file permissions.
WARNING
The create() methods create any parent directories of the file to be written that don't already exist. Though convenient, this behavior may be unexpected. If you want the write to fail when the parent directory doesn't exist, you should check for the existence of the parent directory first by calling the exists() method. Alternatively, use FileContext, which allows you to control whether parent directories are created or not.
There's also an overloaded method for passing a callback interface, Progressable, so your application can be notified of the progress of the data being written to the datanodes:
    package org.apache.hadoop.util;

    public interface Progressable {
      public void progress();
    }
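
To make the callback pattern concrete, here is a minimal sketch that runs without Hadoop on the classpath. It declares a local interface with the same shape as Progressable and passes a lambda to a hypothetical copyWithProgress() helper. The helper, the class name ProgressSketch, and the choice to call progress() once per chunk are assumptions made for illustration. The text above does not say how often Hadoop invokes the callback.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class ProgressSketch {

    // Hypothetical helper (not a Hadoop method): copies in to out in chunks of
    // chunkSize bytes and calls progress() after each chunk has been written.
    static long copyWithProgress(InputStream in, OutputStream out,
                                 int chunkSize, Progressable progressable) {
        byte[] chunk = new byte[chunkSize];
        long total = 0;
        try {
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
                total += n;
                progressable.progress(); // notify the caller that more data was written
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    // Copies 10 bytes in 4-byte chunks and returns how many callbacks fired.
    static int countCallbacks() {
        int[] calls = {0};
        copyWithProgress(new ByteArrayInputStream(new byte[10]),
                new ByteArrayOutputStream(), 4, () -> calls[0]++);
        return calls[0];
    }

    public static void main(String[] args) {
        System.out.println("progress() calls: " + countCallbacks());
    }
}

// Local stand-in with the same single method as org.apache.hadoop.util.Progressable.
interface Progressable {
    void progress();
}
```

Because the interface has a single method, a lambda is enough to implement it. With the real FileSystem API, the same lambda could be passed to the create() overload that accepts a Progressable.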