Table 16-7. A selection of Pig's built-in functions
Category | Function | Description
Eval | AVG | Calculates the average (mean) value of entries in a bag.
Eval | CONCAT | Concatenates byte arrays or character arrays together.
Eval | COUNT | Calculates the number of non-null entries in a bag.
Eval | COUNT_STAR | Calculates the number of entries in a bag, including those that are null.
Eval | DIFF | Calculates the set difference of two bags. If the two arguments are not bags, returns a bag containing both if they are not equal; otherwise, returns an empty bag.
Eval | MAX | Calculates the maximum value of entries in a bag.
Eval | MIN | Calculates the minimum value of entries in a bag.
Eval | SIZE | Calculates the size of a type. The size of numeric types is always 1; for character arrays, it is the number of characters; for byte arrays, the number of bytes; and for containers (tuple, bag, map), it is the number of entries.
Eval | SUM | Calculates the sum of the values of entries in a bag.
Eval | TOBAG | Converts one or more expressions to individual tuples, which are then put in a bag. A synonym for {}.
Eval | TOKENIZE | Tokenizes a character array into a bag of its constituent words.
Eval | TOMAP | Converts an even number of expressions to a map of key-value pairs. A synonym for [].
Eval | TOP | Calculates the top n tuples in a bag.
Eval | TOTUPLE | Converts one or more expressions to a tuple. A synonym for ().
Filter | IsEmpty | Tests whether a bag or map is empty.
Load/Store | PigStorage | Loads or stores relations using a field-delimited text format. Each line is broken into fields using a configurable field delimiter (defaults to a tab character) to be stored in the tuple's fields. It is the default storage when none is specified. [a]
Load/Store | TextLoader | Loads relations from a plain-text format. Each line corresponds to a tuple whose single field is the line of text.
Load/Store | JsonLoader, JsonStorage | Loads or stores relations from or to a (Pig-defined) JSON format. Each tuple is stored on one line.
Load/Store | AvroStorage | Loads or stores relations from or to Avro datafiles.
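
As a rough illustration of how several of the functions in the table fit together, the following Pig Latin sketch loads a delimited file with PigStorage, groups it, and applies a few of the Eval functions to each group. The input and output paths, the relation names, and the (year, temperature) schema are assumptions made for the example; they do not come from the table.

```pig
-- Hypothetical input: tab-delimited lines of (year, temperature).
-- The path and schema are assumptions for illustration only.
records = LOAD 'input/sample.txt' USING PigStorage('\t')
    AS (year:chararray, temperature:int);

-- Collect all records for a year into one bag per group.
grouped = GROUP records BY year;

-- Apply aggregate Eval functions to the bag of temperatures in each group.
summary = FOREACH grouped GENERATE
    group AS year,
    COUNT(records.temperature) AS readings,       -- ignores null temperatures
    COUNT_STAR(records.temperature) AS all_rows,  -- counts nulls as well
    MIN(records.temperature) AS min_temp,
    MAX(records.temperature) AS max_temp,
    AVG(records.temperature) AS avg_temp;

-- Write the result as comma-delimited text.
STORE summary INTO 'output/summary' USING PigStorage(',');
```

If some input lines have a missing temperature, readings and all_rows would differ for that year, which is the distinction the table draws between COUNT and COUNT_STAR.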