(flight_date:chararray,airline_cd:int,airport_cd:chararray,
delay:int,dep_time:int);
Lower = FOREACH FlightData GENERATE lcase(airport_cd);
To create the UDF, you first add a reference to the pig.jar file. After
doing so, you need to create a class that extends the EvalFunc class.
EvalFunc is the base class for all eval functions. The import statements at
the top of the file indicate the various classes you are going to use from the
referenced jar files:
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
public class Lower extends EvalFunc<String>
{
}
The next step is to add an exec function that implements the processing. It
takes a tuple as its input parameter and returns a string:
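public String exec(Tuple arg0) throws IOException
{
    // Return null for a missing or empty input tuple.
    if (arg0 == null || arg0.size() == 0)
        return null;
    try
    {
        // Cast the first field to a string and lowercase it.
        String str = (String)arg0.get(0);
        return str.toLowerCase();
    }
    catch(Exception e)
    {
        // Wrap any failure (for example, a bad cast) in an IOException.
        throw new IOException("Caught exception processing input row ", e);
    }
}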
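The behavior of this exec function can be tried out without a Pig installation. The following is only a sketch under stated assumptions: the class name LowerSketch is hypothetical, a plain List<Object> stands in for org.apache.pig.data.Tuple so that the example compiles without pig.jar, and the value "SEA" is an illustrative airport code, not data from the flight file. The guard, cast, and exception wrapping mirror the steps of the UDF.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Stand-in sketch (not the real UDF): a List<Object> replaces Pig's Tuple
// so the logic can run without pig.jar on the classpath.
public class LowerSketch {

    // Same steps as the UDF's exec: guard against a null or empty row,
    // then cast the first field to String and lowercase it.
    public static String exec(List<Object> row) throws IOException {
        if (row == null || row.size() == 0) {
            return null;
        }
        try {
            String str = (String) row.get(0);
            return str.toLowerCase();
        } catch (Exception e) {
            // A bad cast or a null field ends up here and is rethrown.
            throw new IOException("Caught exception processing input row ", e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Valid string field: prints "sea"
        System.out.println(exec(Arrays.<Object>asList("SEA")));
        // Empty row: prints "null"
        System.out.println(exec(Collections.<Object>emptyList()));
        try {
            // Non-string field: the cast fails and is wrapped.
            exec(Arrays.<Object>asList(42));
        } catch (IOException e) {
            // prints "wrapped: ClassCastException"
            System.out.println("wrapped: " + e.getCause().getClass().getSimpleName());
        }
    }
}
```

Note that a null or empty row yields null rather than an error, whereas a field that is not a string surfaces as an IOException whose cause is the original exception.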
The first part of the exec function checks that the input tuple is valid
(neither null nor empty) and returns null if it is not. The rest of the
processing sits in a try-catch block. The try block converts the string to