Database Reference
In-Depth Information
2013-05-03 14:52:19
2013-08-03 14:52:19|INFO|started gpfdist -p
8082 -P 8083 -f "/home/master/SAMPLE.csv" -t 30
2013-08-03 14:52:19|INFO|running time: 0.25
seconds
2013-08-03 14:52:19|INFO|rows Inserted
= 4
2013-08-03 14:52:19|INFO|rows Updated
= 0
2013-08-03 14:52:19|INFO|data formatting errors
= 0
2013-08-03 14:52:19|INFO|gpload succeeded
Note
The gpload program processes the control file document in order and uses in-
dentation to demarcate the hierarchy. White spaces and tabs usage are restric-
ted.
Hadoop (HD) data loading options
We will now look at ways to load data into Hadoop. To handle unstructured data pro-
cessing and analytics, Greenplum provides a commercial Hadoop distribution with
some proprietary integration pieces built to work with Greenplum Database, Chorus,
and Command Center.
Sqoop 2
In this section, we will explore an option for data loading and unloading requirements
for Hadoop with Sqoop API. Sqoop is a framework that ships with Hadoop and forms
apartofHadoopecosystemaslistedin Chapter2 , Greenplum Unified Analytics Plat-
form (UAP) . This section is not meant to be a tutorial for Sqoop, but is intended to
introduce the readers to this concept.
Data can be loaded independently into Hadoop using Sqoop API. As databases are
not vastly accessible by Hadoop, Apache Hadoop was added to Hadoop ecosys-
Search WWH ::




Custom Search