Pig User-Defined Functions
Coded in Java, user-defined functions (UDFs) provide custom functionality that you can invoke from a Pig script. For
instance, you might create a UDF if you found that you needed to carry out an operation that the standard Pig Latin
language did not include. This section will provide an example of just such a function. You will examine the UDF Java
code, as well as the method by which it is built into a jar library. You will then use an extended version of the Pig script
from the last section that incorporates this UDF. You will learn how to incorporate both the jar file and its classes into
a Pig script.
Just as the earlier Map Reduce jobs were extended with extra functionality, such as removing unwanted characters
from the word-count process, the same will be done here. Using Java, you will create a UDF to remove unwanted
characters, so that the final word count is more precise. For this example, I have created a UDF build directory on the
Linux file system under /home/hadoop/pig/wcudfs that contains a number of files:
[hadoop@hc1nn wcudfs]$ pwd
/home/hadoop/pig/wcudfs/
[hadoop@hc1nn wcudfs]$ ls
build_clean_ws.sh build_lower.sh CleanWS.java
The Java files contain the code for UDFs, while the shell scripts (*.sh) are used to build them. The CleanWS.java
file contains the following code:
package wcudfs;

import java.io.*;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.apache.hadoop.util.*;

public class CleanWS extends EvalFunc<String>
{
  /*--------------------------------------------------------*/
  @Override
  public String exec(Tuple input) throws IOException
  {
    // An empty or missing input row yields no result
    if (input == null || input.size() == 0)
      return null;
    try
    {
      String str = (String)input.get(0);

      // Replace every character that is not a letter or digit with a space
      return str.replaceAll("[^A-Za-z0-9]"," ");
    }
    catch(IOException ioe)
    {
      System.err.println("Caught exception processing input row : "
        + StringUtils.stringifyException(ioe) );
    }
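The effect of the replaceAll call used in exec can be checked outside Pig with plain Java. The following is only a minimal sketch: the class name CleanDemo, the helper methods clean and countWords, and the sample sentence are all hypothetical and not part of the UDF. It exercises the same regular expression but does not use the Pig EvalFunc API, and the simple whitespace split stands in for the word count that the Pig script performs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone demo; not part of the wcudfs package.
class CleanDemo {

    // Same expression as in CleanWS.exec: every character that is not
    // an ASCII letter or digit is replaced by a single space.
    public static String clean(String line) {
        return line.replaceAll("[^A-Za-z0-9]", " ");
    }

    // Assumed stand-in for the word count: split on runs of whitespace
    // and tally each token in first-seen order.
    public static Map<String, Integer> countWords(String line) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample line with punctuation attached to words
        String line = "The raven; the \"raven\", the raven!";

        System.out.println(countWords(line));        // counts on the raw text
        System.out.println(countWords(clean(line))); // counts after cleaning
    }
}
```

Without cleaning, the tokens raven; and "raven", and raven! are counted as three different words. After cleaning, they collapse into the single word raven with a count of three, which is the gain in precision that the UDF is meant to provide. Note that the expression does not change case, so The and the are still counted separately.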
 