Multiple URLs can be specified, and the number of URLs corresponds to the number of
segment instances that work in parallel to access the web table.
The following is an example command to create a web external table from many
URLs:
CREATE EXTERNAL WEB TABLE test_table (id int,
name text, date date, description text)
LOCATION (
'http://abc.com/test1/file.csv',
'http://abc.com/test2/file.csv',
'http://abc.com/test3/file.csv'
)
FORMAT 'CSV' ( HEADER );
The following sections will explain different ways of loading data into Greenplum.
gpfdist
The gpfdist protocol provides the best parallel performance. gpfdist is a utility
included with Greenplum and is easy to install. It is responsible for ensuring optimal
use of the segments when reads are run against external tables. The utility runs on the
server where the external files are located. It can be used similarly to the file://
protocol shown in the preceding section to load data from a file source into a regular
external table.
For example, the following command defines an external table that reads text files
served by gpfdist instances running on ports 8081 and 8082, respectively:
CREATE EXTERNAL TABLE test_table (id int, name
text, date date, description text) LOCATION (
'gpfdist://localhost:8081/*.txt',
'gpfdist://localhost1:8082/*.txt') FORMAT 'TEXT'
(DELIMITER '|' );
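Creating the external table only defines where the files are; no rows are moved until the table is queried. As a minimal sketch of the load step, assuming a regular target table with the same columns (the name test_table_load and the choice of id as the distribution key are illustrative assumptions, not part of the original example), the data could be copied as follows:

```sql
-- Hypothetical target table; its name and distribution key are assumptions.
CREATE TABLE test_table_load (
    id int,
    name text,
    date date,
    description text
) DISTRIBUTED BY (id);

-- Reading from the external table makes the segments pull the files
-- from the gpfdist instances in parallel and insert the rows.
INSERT INTO test_table_load
SELECT * FROM test_table;
```

This assumes gpfdist is already running on each file host and listening on the ports named in the LOCATION clause.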