on Tuple. The field is an integer, so if it's not null, we cast it and check whether the value is one that signifies the temperature was a good reading, returning the appropriate value, true or false.
Example 16-1. A FilterFunc UDF to remove records with unsatisfactory temperature quality readings
package com.hadoopbook.pig;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.pig.FilterFunc;
import org.apache.pig.backend.executionengine.ExecException;
import org.apache.pig.data.DataType;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.logicalLayer.FrontendException;

public class IsGoodQuality extends FilterFunc {

  @Override
  public Boolean exec(Tuple tuple) throws IOException {
    if (tuple == null || tuple.size() == 0) {
      return false;
    }
    try {
      Object object = tuple.get(0);
      if (object == null) {
        return false;
      }
      int i = (Integer) object;
      return i == 0 || i == 1 || i == 4 || i == 5 || i == 9;
    } catch (ExecException e) {
      throw new IOException(e);
    }
  }
}
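To see the filter's decision logic in isolation, the following is a minimal sketch that mirrors the body of exec() without any Pig dependency. It is not part of the example code: a java.util.List stands in for Pig's Tuple, and the names QualityCheckSketch and isGoodQuality are hypothetical, introduced here only for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for IsGoodQuality; a List plays the role of Pig's Tuple.
final class QualityCheckSketch {

  // Returns true only when the first field is a non-null quality code
  // that marks a good reading (0, 1, 4, 5, or 9).
  static boolean isGoodQuality(List<Object> fields) {
    if (fields == null || fields.isEmpty()) {
      return false; // no fields at all: treat as a bad record
    }
    Object object = fields.get(0);
    if (object == null) {
      return false; // null quality field: treat as a bad record
    }
    int i = (Integer) object; // the field is assumed to hold an Integer
    return i == 0 || i == 1 || i == 4 || i == 5 || i == 9;
  }

  public static void main(String[] args) {
    System.out.println(isGoodQuality(Arrays.<Object>asList(1)));                // true
    System.out.println(isGoodQuality(Arrays.<Object>asList(2)));                // false
    System.out.println(isGoodQuality(Collections.<Object>singletonList(null))); // false
    System.out.println(isGoodQuality(Collections.<Object>emptyList()));         // false
  }
}
```

As in the UDF, a missing or null field yields false instead of an exception, so records with no usable quality code are simply filtered out.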
To use the new function, we first compile it and package it in a JAR file (the example code that accompanies this topic comes with build instructions for how to do this). Then we tell Pig about the JAR file with the REGISTER operator, which is given the local path to the filename (and is not enclosed in quotes):