Avro in Other Languages
For languages and frameworks other than Java, there are a few choices for working with
Avro data.
AvroAsTextInputFormat is designed to allow Hadoop Streaming programs to read
Avro datafiles. Each datum in the file is converted to a string, which is the JSON
representation of the datum, or just to the raw bytes if the type is Avro bytes. Going the
other way, you can specify AvroTextOutputFormat as the output format of a Streaming job to
create Avro datafiles with a bytes schema, where each datum is the tab-delimited
key-value pair written from the Streaming output. Both of these classes can be found in the
org.apache.avro.mapred package.
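As a rough illustration of the Streaming side of this arrangement, the following Python sketch shows what a mapper might look like when its input lines are JSON strings of the kind AvroAsTextInputFormat produces. The record fields year and temperature, and the function names map_line and run_mapper, are assumptions made for this example only; they are not part of Avro or Hadoop. The sketch also assumes each datum is a JSON object with plain (non-union) field values.

```python
import json


def map_line(line):
    """Convert one JSON-encoded datum into a tab-delimited key-value line.

    Returns None for blank lines or for records missing either of the
    (hypothetical) fields this example expects.
    """
    # Strip the newline and any trailing tab left by key/value separation.
    text = line.strip()
    if not text:
        return None
    record = json.loads(text)
    year = record.get("year")                # assumed field name
    temperature = record.get("temperature")  # assumed field name
    if year is None or temperature is None:
        return None
    return f"{year}\t{temperature}"


def run_mapper(lines):
    """Yield one output line per usable input line."""
    for line in lines:
        out = map_line(line)
        if out is not None:
            yield out


# In a real Streaming script, the mapper would be driven from standard input:
#     import sys
#     for out in run_mapper(sys.stdin):
#         print(out)
```

A script like this would be passed to Hadoop Streaming as the mapper, with the input and output format classes named through Streaming's -inputformat and -outputformat options. Each tab-delimited line the script prints is the kind of key-value pair that AvroTextOutputFormat would then write as a bytes datum.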
It's also worth considering other frameworks like Pig, Hive, Crunch, and Spark for doing
Avro processing, since they can all read and write Avro datafiles by specifying the
appropriate storage formats. See the relevant chapters in this book for details.
[79] Named after the British aircraft manufacturer from the 20th century.
[80] Avro also performs favorably compared to other serialization libraries, as the benchmarks demonstrate.
[81] Avro can be downloaded in both source and binary forms. Get usage instructions for the Avro tools by
typing java -jar avro-tools-*.jar.
[82] Default values for fields are encoded using JSON. See the Avro specification for a description of this
encoding for each data type.
[83] A useful consequence of this property is that you can compute an Avro datum's hash code from either the
object or the binary representation (the latter by using the static hashCode() method on BinaryData)
and get the same result in both cases.
[84] For an example that uses the Specific mapping with generated classes, see the
AvroSpecificMaxTemperature class in the example code.
[85] If we had used the identity mapper and reducer here, the program would sort and remove duplicate keys
at the same time. We encounter this idea of duplicating information from the key in the value object again in
Secondary Sort.