24     Date parsedate = incommingDateFormat.parse( s.toString() );
25
26     to_value.set( convertedDateFormat.format(parsedate) );
27
28    }
29    catch (Exception e)
30    {
31      to_value = new Text(s);
32    }
33   }
34   return to_value;
35  }
36 }
The package name is defined at line 1, while the import statements that bring in Hive, Hadoop, and Java functionality appear at lines 3 through 6.
1 package nz.co.semtech-solutions.hive.udf;
3 import org.apache.hadoop.hive.ql.exec.UDF;
4 import org.apache.hadoop.io.Text;
5 import java.text.SimpleDateFormat;
6 import java.util.Date;
The class DateConv, which implements the UDF, is defined at line 8; it extends the existing Hive class UDF.
8 class DateConv extends UDF
At line 11, the public method evaluate is defined, which takes a Text parameter and returns a Text value:
11 public Text evaluate(Text s)
Finally, the main functionality of the UDF occurs between lines 21 and 26, in the try/catch section of the code. The
input date string is converted from the format dd/MM/yyyy to the format yyyy-MM-dd. (This is a somewhat contrived
example that handles only a single input date format, but it gives an idea of what can be achieved with Hive UDFs.)
21 SimpleDateFormat incommingDateFormat = new SimpleDateFormat("dd/MM/yyyy");
22 SimpleDateFormat convertedDateFormat = new SimpleDateFormat("yyyy-MM-dd");
23
24 Date parsedate = incommingDateFormat.parse( s.toString() );
25
26 to_value.set( convertedDateFormat.format(parsedate) );
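The conversion logic at the heart of the try/catch block can be exercised in isolation with plain Java, without any Hive or Hadoop dependencies. The class and helper names below (DateConvDemo, convertDate) are hypothetical, chosen just for this sketch; the parsing and fallback behavior mirror the UDF fragments above:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class DateConvDemo {

    // Mirrors the UDF's try/catch: parse dd/MM/yyyy and reformat as
    // yyyy-MM-dd; if parsing fails, return the input string unchanged,
    // just as the UDF returns new Text(s) in its catch block.
    static String convertDate(String s) {
        SimpleDateFormat incomingDateFormat  = new SimpleDateFormat("dd/MM/yyyy");
        SimpleDateFormat convertedDateFormat = new SimpleDateFormat("yyyy-MM-dd");
        try {
            Date parsed = incomingDateFormat.parse(s);
            return convertedDateFormat.format(parsed);
        } catch (ParseException e) {
            return s;
        }
    }

    public static void main(String[] args) {
        System.out.println(convertDate("25/12/2014")); // prints 2014-12-25
        System.out.println(convertDate("not a date")); // falls back to the input
    }
}
```

The same fallback-on-failure design is what makes the UDF safe to run over a column containing malformed dates: bad rows pass through untouched rather than aborting the query.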
Having created the Java file that will form the new UDF, I move back to the top of the directory structure
using the Linux cd command and invoke the sbt command to compile the code:
[hadoop@hc2nn udf]$ cd /home/hadoop/hive/udf/
[hadoop@hc2nn udf]$ sbt
[info] Set current project to DateConv (in build file:/home/hadoop/hive/udf/)
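For sbt to resolve the Hive and Hadoop classes at compile time, the project needs a build definition. A minimal build.sbt along these lines would work; the artifact versions shown are assumptions, not taken from the source, and should match the cluster's Hive and Hadoop releases:

```scala
// Hypothetical minimal build.sbt for the DateConv UDF project.
name := "DateConv"

version := "0.1"

// Hive and Hadoop jars are needed only to compile; the cluster
// supplies them at run time, hence the "provided" scope.
libraryDependencies ++= Seq(
  "org.apache.hadoop" % "hadoop-common" % "2.3.0"  % "provided",
  "org.apache.hive"   % "hive-exec"     % "0.13.0" % "provided"
)
```

With this in place, running sbt package produces a jar that can be added to Hive with ADD JAR and registered via CREATE TEMPORARY FUNCTION.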