Next, we can use the explode UDTF to transform this table. This function emits a row for each entry in the array, so in this case the type of the output column y is STRING. The result is that the table is flattened into five rows:
hive> SELECT explode(x) AS y FROM arrays;
a
b
c
d
e
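The flattening that explode performs can be sketched in plain Java. This is only an illustration of the row-per-element behavior, not Hive's implementation: the class name ExplodeSketch, the method explode, and the two sample arrays are assumptions chosen so that the result matches the five rows shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExplodeSketch {
    // Emits one output row per array element, mimicking what explode
    // does to an ARRAY<STRING> column.
    static List<String> explode(List<List<String>> column) {
        List<String> rows = new ArrayList<>();
        for (List<String> array : column) {
            if (array == null) {
                continue; // assumption: a NULL array contributes no rows
            }
            rows.addAll(array);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical contents of column x; the text only shows the output rows.
        List<List<String>> x = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c", "d", "e"));
        for (String y : explode(x)) {
            System.out.println(y);
        }
    }
}
```

Each element becomes its own row regardless of which input row it came from, which is why the output column has the element type (STRING) rather than the array type.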
SELECT statements using UDTFs have some restrictions (e.g., they cannot retrieve additional column expressions), which make them less useful in practice. For this reason, Hive supports LATERAL VIEW queries, which are more powerful.
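As a rough model of what a lateral view adds, the Java sketch below pairs each exploded element with another column from the same source row, which a bare UDTF SELECT cannot do. It is a sketch of the idea only: the class LateralViewSketch, the method lateralExplode, and the id column are hypothetical and do not appear in the original table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LateralViewSketch {
    // Joins every source row to the rows produced by exploding its array,
    // so the extra column (ids) stays available next to each element.
    static List<String> lateralExplode(List<String> ids, List<List<String>> arrays) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < ids.size(); i++) {
            for (String element : arrays.get(i)) {
                rows.add(ids.get(i) + "\t" + element);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical data: an id column alongside the array column.
        List<String> ids = Arrays.asList("1", "2");
        List<List<String>> arrays = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"));
        for (String row : lateralExplode(ids, arrays)) {
            System.out.println(row);
        }
    }
}
```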
Writing a UDF
To illustrate the process of writing and using a UDF, we'll write a simple UDF to trim characters from the ends of strings. Hive already has a built-in function called trim, so
Example 17-2. A UDF for stripping characters from the ends of strings
package com.hadoopbook.hive;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class Strip extends UDF {
  private Text result = new Text();

  public Text evaluate(Text str) {
    if (str == null) {
      return null;
    }
    result.set(StringUtils.strip(str.toString()));
    return result;
  }

  public Text evaluate(Text str, String stripChars) {
    if (str == null) {
      return null;
    }
    result.set(StringUtils.strip(str.toString(), stripChars));
    return result;