Database Reference
In-Depth Information
$ gpfdist -d /var/load_files2 -p
8082 -l /home/gpadmin/log2 &
• To stop gpfdist when it is running in the background:
First find its process id:
$ ps -ef | grep gpfdist
Then kill the process, for example (where 3456 is the process ID in
this example):
$ kill xxxx
gpload
The gpload data loading utility is used to load data into Greenplum's external table
in parallel. gpload uses YAML formatted control file that has the following com-
mands/scripts to load data into the target database:
• Invoke the Greenplum parallel file server program ( gpfdist )
• Create an external table definition based on the source data defined
• Load the source data into the target table in the database according to gp-
load mode (insert, update, or merge)
It is important to note that with GPLOAD we have to deal with YAML, which is not
simple and requires skill. But, as it acts as a wrapper simplifying multiple implement-
ations into one, we can have parallel file-based external table setup with configura-
tion of the data format, external table definition, and gpfdist or gpfdists setup in
a single configuration file. It executes SQL against the external table. The external
temporary external table is dropped once the load gets completed.
For example, test.yml :
%YAML 1.1
---
Search WWH ::




Custom Search