Database Reference
In-Depth Information
FlightData = LOAD 'FlightPerformance.csv' using
PigStorage(',');
Another built-in load function is JsonLoader . This is used to load JSON
(JavaScript Object Notation)-formatted files. JSON is a text-based open
standard designed for human-readable data interchange and is often used
to transmit structured data over network connections. The following code
loads a JSON-formatted file:
FlightData = LOAD ' FlightPerformance.json' using
JsonLoader();
NOTE
For more information on JSON, see http://www.json.org/ .
You can also store data using the storage functions. For example, the
following code stores data into a tab-delimited text file (the default format):
STORE FlightData into 'FlightDataProcessed' using
PigStorage();
Functions used to evaluate and aggregate data include IsEmpty , Size ,
Count , Sum , Avg , and Concat , to name a few. The following code filters out
tuples that have an empty airport code:
FlightDataFiltered = Filter FlightData By
IsEmpty(AirportCode);
Common math functions include ABS , CEIL , SIN , and TAN . The following
code uses the CEIL function to round the delay times up to the nearest
minute (integer):
FlightDataCeil = FOREACH FlightData
GENERATE CEIL(FlightDelay) AS FlightDelay2;
Search WWH ::




Custom Search