processed. Prior to the STORE command, Pig had begun to build an execution plan
but had not yet initiated MapReduce processing.
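The deferred execution described above can be sketched in a few lines of Pig Latin. The file paths, relation names, and schema below are hypothetical and exist only for illustration; the point is that the LOAD, FILTER, and FOREACH statements only extend the execution plan, and the MapReduce work begins when STORE (or DUMP) is reached.

```pig
-- Hypothetical input path and schema, for illustration only.
requests = LOAD 'input/requests.txt' USING PigStorage('\t')
           AS (user_id:chararray, url:chararray, num_bytes:long);

-- These statements only add steps to the plan; no job runs yet.
large = FILTER requests BY num_bytes > 1024L;
slim  = FOREACH large GENERATE user_id, num_bytes;

-- STORE asks for output, so Pig now compiles the plan
-- into one or more MapReduce jobs and runs them.
STORE slim INTO 'output/large_requests' USING PigStorage('\t');
```

Replacing the STORE line with `DUMP slim;` would trigger execution in the same way and print the result to the console instead of writing files.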
Pig supports several common data manipulations, such as inner and outer joins
between two or more files (tables), as would be expected in a typical relational
database. Writing these joins explicitly in MapReduce using Hadoop would be
quite involved. Pig also provides GROUP BY functionality that is similar to the
GROUP BY clause in SQL. Chapter 11 has more details on using GROUP BY and
other SQL statements.
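As a rough sketch of how compact these operations are in Pig Latin, the script below joins two relations and then groups and aggregates one of them. The file names, field names, and relation names are assumptions made up for this example and do not come from the text.

```pig
-- Hypothetical comma-delimited inputs, for illustration only.
customers = LOAD 'input/customers.csv' USING PigStorage(',')
            AS (cust_id:int, name:chararray);
orders    = LOAD 'input/orders.csv' USING PigStorage(',')
            AS (order_id:int, cust_id:int, amount:double);

-- Inner join: only customers that have at least one order.
cust_orders = JOIN customers BY cust_id, orders BY cust_id;

-- Left outer join: customers without orders are kept, with nulls
-- in the order fields.
all_customers = JOIN customers BY cust_id LEFT OUTER, orders BY cust_id;

-- GROUP ... BY collects the orders for each customer into a bag,
-- which the aggregate functions then summarize.
by_customer = GROUP orders BY cust_id;
totals = FOREACH by_customer GENERATE
             group AS cust_id,
             COUNT(orders) AS n_orders,
             SUM(orders.amount) AS total_amount;

STORE totals INTO 'output/order_totals' USING PigStorage(',');
```

Only `totals` is stored here, so the two join relations are defined but never executed; adding a STORE or DUMP for `cust_orders` or `all_customers` would make Pig run them as well.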
Pig also provides many built-in functions that can be called directly in Pig
code. Table 10.1 lists several useful functions by category.
Table 10.1 Built-In Pig Functions
Eval        Load/Store     Math        String          DateTime
AVG         BinStorage()   ABS         INDEXOF         AddDuration
CONCAT      JsonLoader     CEIL        LAST_INDEX_OF   CurrentTime
COUNT       JsonStorage    COS, ACOS   LCFIRST         DaysBetween
COUNT_STAR  PigDump        EXP         LOWER           GetDay
DIFF        PigStorage     FLOOR       REGEX_EXTRACT   GetHour
IsEmpty     TextLoader     LOG, LOG10  REPLACE         GetMinute
MAX         HBaseStorage   RANDOM      STRSPLIT        GetMonth
MIN                        ROUND       SUBSTRING       GetWeek
SIZE                       SIN, ASIN   TRIM            GetWeekYear
SUM                        SQRT        UCFIRST         GetYear
TOKENIZE                   TAN, ATAN   UPPER           MinutesBetween
                                                       SubtractDuration
                                                       ToDate
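To illustrate how functions from several of these categories combine, here is a minimal sketch that uses PigStorage, TRIM, UPPER, ToDate, GetYear, ROUND, COUNT, AVG, and MAX from Table 10.1. The input file, its schema, and the date format are hypothetical.

```pig
-- Hypothetical input: name, signup date as text (yyyy-MM-dd), and a score.
people = LOAD 'input/people.csv' USING PigStorage(',')
         AS (name:chararray, signup_date:chararray, score:double);

-- String, DateTime, and Math functions applied row by row.
cleaned = FOREACH people GENERATE
              UPPER(TRIM(name)) AS name_uc,
              GetYear(ToDate(signup_date, 'yyyy-MM-dd')) AS signup_year,
              ROUND(score) AS score_rounded;

-- Eval (aggregate) functions applied to the whole relation.
all_rows = GROUP cleaned ALL;
summary  = FOREACH all_rows GENERATE
               COUNT(cleaned) AS n_rows,
               AVG(cleaned.score_rounded) AS avg_score,
               MAX(cleaned.score_rounded) AS max_score;

DUMP summary;
```

Aggregates such as COUNT, AVG, and MAX operate on bags, which is why the relation is first grouped (here with GROUP ... ALL) before they are applied.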