Hive - Hadoop: The Definitive Guide


Next, we can use the explode UDTF to transform this table. This function emits a row for each entry in the array, so in this case the type of the output column y is STRING. The result is that the table is flattened into five rows:

hive> SELECT explode(x) AS y FROM arrays;
a
b
c
d
e
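
The row-per-entry behavior can be sketched outside Hive in plain Java. This is only an illustration of the semantics, not how Hive implements explode: the class ExplodeSketch and its explode method are hypothetical names, and the split of the five values across two input rows is an assumption about the contents of the arrays table.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of explode semantics; not part of Hive.
class ExplodeSketch {

  // Emits one output row per array entry, preserving input order.
  static List<String> explode(List<List<String>> rows) {
    return rows.stream()
        .flatMap(List::stream)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Assumed table contents: two rows holding five strings in total.
    List<List<String>> arrays = Arrays.asList(
        Arrays.asList("a", "b"),
        Arrays.asList("c", "d", "e"));

    // Prints a, b, c, d, e on separate lines.
    explode(arrays).forEach(System.out::println);
  }
}
```

Each input row contributes as many output rows as its array has entries, which is why two input rows become five output rows here.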

SELECT statements using UDTFs have some restrictions (e.g., they cannot retrieve additional column expressions), which make them less useful in practice. For this reason, Hive supports LATERAL VIEW queries, which are more powerful. LATERAL VIEW queries are not covered here, but you may find out more about them in the Hive wiki.

Writing a UDF

To illustrate the process of writing and using a UDF, we'll write a simple UDF to trim characters from the ends of strings. Hive already has a built-in function called trim, so we'll call ours strip. The code for the Strip Java class is shown in Example 17-2.
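
Before looking at the Hive class, it may help to see the intended stripping behavior on its own. The following is a minimal stand-alone sketch with no Hive or Commons Lang dependency. The class StripSketch and its helper shouldStrip are hypothetical names introduced here, and the whitespace rule for a null character set is an assumption modeled on StringUtils.strip, which the UDF in Example 17-2 delegates to.

```java
// Hypothetical stand-alone sketch; not part of Hive or Commons Lang.
class StripSketch {

  // Strips leading and trailing whitespace.
  static String strip(String str) {
    return strip(str, null);
  }

  // Strips any character found in stripChars from both ends of str.
  // A null stripChars means "strip whitespace"; a null str yields null.
  static String strip(String str, String stripChars) {
    if (str == null) {
      return null;
    }
    int start = 0;
    int end = str.length();
    // Advance past strippable characters at the front.
    while (start < end && shouldStrip(str.charAt(start), stripChars)) {
      start++;
    }
    // Retreat past strippable characters at the back.
    while (end > start && shouldStrip(str.charAt(end - 1), stripChars)) {
      end--;
    }
    return str.substring(start, end);
  }

  private static boolean shouldStrip(char c, String stripChars) {
    return stripChars == null
        ? Character.isWhitespace(c)
        : stripChars.indexOf(c) >= 0;
  }

  public static void main(String[] args) {
    System.out.println(strip("  bee  "));       // prints "bee"
    System.out.println(strip("banana", "ab"));  // prints "nan"
  }
}
```

Only the ends of the string are affected: in the second call the interior "a" survives because stripping stops at the first character that is not in the set.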

Example 17-2. A UDF for stripping characters from the ends of strings

package com.hadoopbook.hive;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class Strip extends UDF {
  private Text result = new Text();

  public Text evaluate(Text str) {
    if (str == null) {
      return null;
    }
    result.set(StringUtils.strip(str.toString()));
    return result;
  }

  public Text evaluate(Text str, String stripChars) {
    if (str == null) {
      return null;
    }
    result.set(StringUtils.strip(str.toString(), stripChars));
    return result;
