Path p = new Path("p");
fs.create(p);
assertThat(fs.exists(p), is(true));
However, any content written to the file is not guaranteed to be visible, even if the stream
is flushed. So, the file appears to have a length of zero:
Path p = new Path("p");
OutputStream out = fs.create(p);
out.write("content".getBytes("UTF-8"));
out.flush();
assertThat(fs.getFileStatus(p).getLen(), is(0L));
Once more than a block's worth of data has been written, the first block will be visible to
new readers. This is true of subsequent blocks, too: it is always the current block being
written that is not visible to other readers.
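The block-at-a-time rule can be made concrete with a little arithmetic. The following is only a simplified model, not HDFS code: the class name BlockVisibilitySketch and the method visibleLength are hypothetical, and the model assumes that no hflush() has been called and that a block counts as complete only once writing has moved on to the next one.

```java
public class BlockVisibilitySketch {

    /**
     * Simplified model (an assumption, not an HDFS API): returns how many
     * bytes a new reader could see, given the bytes written so far and the
     * block size, when only blocks before the current one are visible.
     */
    static long visibleLength(long bytesWritten, long blockSize) {
        if (bytesWritten < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("bytesWritten must be >= 0 and blockSize > 0");
        }
        if (bytesWritten == 0) {
            return 0L;
        }
        // The block holding the most recently written byte is the current
        // block; every block before it is treated as complete and visible.
        long completedBlocks = (bytesWritten - 1) / blockSize;
        return completedBlocks * blockSize;
    }

    public static void main(String[] args) {
        long blockSize = 128L; // deliberately tiny block size for illustration
        System.out.println(visibleLength(100L, blockSize)); // still in the first block
        System.out.println(visibleLength(129L, blockSize)); // first block is now visible
        System.out.println(visibleLength(300L, blockSize)); // two blocks are visible
    }
}
```

In this model, a writer that has produced 300 bytes with a 128-byte block size exposes 256 bytes to new readers; the remaining 44 bytes sit in the current block and stay invisible.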
HDFS provides a way to force all buffers to be flushed to the datanodes via the
hflush() method on FSDataOutputStream. After a successful return from
hflush(), HDFS guarantees that the data written up to that point in the file has reached
all the datanodes in the write pipeline and is visible to all new readers:
Path p = new Path("p");
FSDataOutputStream out = fs.create(p);
out.write("content".getBytes("UTF-8"));
out.hflush();
assertThat(fs.getFileStatus(p).getLen(), is(((long) "content".length())));
Note that hflush() does not guarantee that the datanodes have written the data to disk,
only that it's in the datanodes' memory (so in the event of a data center power outage, for
example, data could be lost). For this stronger guarantee, use hsync() instead. [33]
The behavior of hsync() is similar to that of the fsync() system call in POSIX that
commits buffered data for a file descriptor. For example, using the standard Java API to
write a local file, we are guaranteed to see the content after flushing the stream and
synchronizing:
FileOutputStream out = new FileOutputStream(localFile);
out.write("content".getBytes("UTF-8"));
out.flush(); // flush to operating system
out.getFD().sync(); // sync to disk
assertThat(localFile.length(), is(((long) "content".length())));
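For comparison, the same local-file guarantee can be requested through the NIO API, where FileChannel.force(true) plays the role of getFD().sync(). This is a minimal sketch rather than part of the original example: the class name LocalSyncSketch and its methods are hypothetical, and Path here is java.nio.file.Path, not the Hadoop Path class used in the HDFS snippets.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class LocalSyncSketch {

    /** Writes content to file, forces it to the storage device, and returns the file length. */
    static long writeAndSync(Path file, String content) {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buffer = ByteBuffer.wrap(content.getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer); // hand the bytes to the operating system
            }
            channel.force(true); // ask the OS to commit file data and metadata to disk
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs writeAndSync against a temporary file and cleans up afterward. */
    static long demo(String content) {
        try {
            Path tmp = Files.createTempFile("local-sync-sketch", ".txt");
            try {
                return writeAndSync(tmp, content);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("content")); // prints 7, the UTF-8 length of "content"
    }
}
```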
Closing a file in HDFS performs an implicit hflush(), too: