Once the files are in HDFS, you have a couple of ways to retrieve them. One option is the cat command. cat displays the contents of a file on the screen, or its output can be redirected to another output device:
hadoop dfs -cat /user/MSBigDataSolutions/SampleData1.txt
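Because cat writes to standard output, it can be combined with ordinary shell pipes and redirection. The following is a minimal sketch, assuming a Unix-like shell with the hadoop client on the PATH; the helper name hdfs_head is hypothetical and is not part of Hadoop.

```shell
# Sketch only: hdfs_head is an illustrative helper, not a Hadoop command.
# It streams an HDFS file through head so only the first N lines are shown.
hdfs_head() {
    path="$1"          # HDFS path of the file to preview
    lines="${2:-10}"   # number of lines to show (defaults to 10)
    hadoop dfs -cat "$path" | head -n "$lines"
}

# Example usage (commented out so that sourcing this file runs nothing):
#   hdfs_head /user/MSBigDataSolutions/SampleData1.txt 5
# Redirecting the full output to a local file instead of the screen:
#   hadoop dfs -cat /user/MSBigDataSolutions/SampleData1.txt > SampleData1.txt
```

Previewing through head avoids printing a large file to the screen in full.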
You can also use the text command to display a file's contents. The only difference is that text attempts to convert the file to a text format before displaying it. However, because most data in HDFS is already text, cat will usually work.
To get the contents of a file back to the local file system from HDFS, use the get command:
hadoop dfs -get /user/MSBigDataSolutions/SampleData1.txt C:\MSBigDataSolutions\Output
Just like the put command, get can work with multiple files simultaneously, either by specifying a folder or a wildcard:
hadoop dfs -get /user/MSBigDataSolutions/SampleData_* C:\MSBigDataSolutions\Output
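When get is run with a wildcard from a Unix-like shell, it is safer to quote the pattern so that it reaches HDFS unchanged instead of being expanded against local files. The sketch below assumes such a shell; the helper name hdfs_get_all is hypothetical, and creating the destination directory beforehand is a convenience rather than a requirement stated in the text.

```shell
# Sketch only: hdfs_get_all is an illustrative helper, not a Hadoop command.
# It copies every HDFS file matching a pattern into a local directory.
hdfs_get_all() {
    pattern="$1"   # HDFS path, optionally containing a wildcard
    dest="$2"      # local destination directory
    # Make sure the local destination exists before copying into it.
    mkdir -p "$dest" || return 1
    # The pattern is quoted so the local shell passes it through unexpanded.
    hadoop dfs -get "$pattern" "$dest"
}

# Example usage (commented out so that sourcing this file runs nothing):
#   hdfs_get_all '/user/MSBigDataSolutions/SampleData_*' ./Output
```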
get also has two related commands. copyToLocal works exactly like the get command and is simply an alias for it. moveToLocal also functions like get, with the difference that the HDFS file is deleted after the specified file(s) are copied to the local file system. Be aware that some Hadoop releases only print a "not implemented yet" message for moveToLocal, so check the behavior on your version.
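The move semantics described here (copy to the local file system, then delete from HDFS) can also be approximated by hand, which may help on a release where moveToLocal is not available. This is a rough sketch for a single file, assuming a Unix-like shell; the helper name hdfs_move_to_local is hypothetical.

```shell
# Sketch only: hdfs_move_to_local is an illustrative helper, not a Hadoop command.
# It copies one HDFS file to the local file system, then removes the HDFS copy.
hdfs_move_to_local() {
    src="$1"    # HDFS file to move
    dest="$2"   # local destination path
    # && ensures the HDFS file is removed only if the copy succeeded.
    hadoop dfs -get "$src" "$dest" && hadoop dfs -rm "$src"
}

# Example usage (commented out so that sourcing this file runs nothing):
#   hdfs_move_to_local /user/MSBigDataSolutions/SampleData1.txt ./Output
```

Chaining the two commands with && means a failed copy leaves the HDFS file in place.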
Copying and moving files and directories within HDFS can be done with the cp and mv commands, respectively:
hadoop dfs -cp /user/MSBigDataSolutions /user/Backup
hadoop dfs -mv /user/MSBigDataSolutions /user/Backup2
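When cp is used to take a backup, it can be worth checking first that the destination does not already exist. The sketch below assumes a Unix-like shell and uses the file system shell's -test -e option to check for an existing path; the helper name hdfs_backup is hypothetical.

```shell
# Sketch only: hdfs_backup is an illustrative helper, not a Hadoop command.
# It copies an HDFS path to a backup location unless that location already exists.
hdfs_backup() {
    src="$1"   # HDFS path to copy
    dst="$2"   # HDFS backup destination
    # -test -e returns success (exit status 0) when the path exists.
    if hadoop dfs -test -e "$dst"; then
        echo "refusing to overwrite existing path: $dst" >&2
        return 1
    fi
    hadoop dfs -cp "$src" "$dst"
}

# Example usage (commented out so that sourcing this file runs nothing):
#   hdfs_backup /user/MSBigDataSolutions /user/Backup
```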
You can delete a file in HDFS with the rm command. rm does not remove directories, though. For that, you must use the rmr command (deprecated in newer Hadoop releases in favor of rm -r):