In-Memory Serialization and Deserialization
Avro provides APIs for serialization and deserialization that are useful when you want to
integrate Avro with an existing system, such as a messaging system where the framing
format is already defined. In other cases, consider using Avro's datafile format.
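As a rough illustration of what "the framing format is already defined" can mean, the sketch below wraps an already-serialized payload in a frame. The frame layout (a 4-byte big-endian length followed by the payload) and the class and method names are assumptions made for this example; they are not part of Avro or of any particular messaging system.

```java
import java.nio.ByteBuffer;

// Hypothetical framing helper: not an Avro API, just an example of a
// framing format that an existing messaging system might already define.
final class LengthPrefixedFraming {

    // Prepends a 4-byte big-endian length to the payload bytes.
    static byte[] frame(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)   // ByteBuffer is big-endian by default
                .put(payload)
                .array();
    }

    // Reads the length prefix, then copies out exactly that many payload bytes.
    static byte[] unframe(byte[] frame) {
        ByteBuffer buffer = ByteBuffer.wrap(frame);
        int length = buffer.getInt();
        byte[] payload = new byte[length];
        buffer.get(payload);
        return payload;
    }
}
```

In a setup like this, the payload would be the bytes produced by Avro's in-memory serialization, and the surrounding system would remain responsible for the frame itself.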
Let's write a Java program to read and write Avro data from and to streams. We'll start with
a simple Avro schema for representing a pair of strings as a record:
{
  "type": "record",
  "name": "StringPair",
  "doc": "A pair of strings.",
  "fields": [
    {"name": "left", "type": "string"},
    {"name": "right", "type": "string"}
  ]
}
If this schema is saved in a file on the classpath called StringPair.avsc (.avsc is the
conventional extension for an Avro schema), we can load it using the following two lines of code:
Schema.Parser parser = new Schema.Parser();
Schema schema = parser.parse(
    getClass().getResourceAsStream("StringPair.avsc"));
We can create an instance of an Avro record using the Generic API as follows:
GenericRecord datum = new GenericData.Record(schema);
datum.put("left", "L");
datum.put("right", "R");
Next, we serialize the record to an output stream:
ByteArrayOutputStream out = new ByteArrayOutputStream();
DatumWriter<GenericRecord> writer =
    new GenericDatumWriter<GenericRecord>(schema);
Encoder encoder = EncoderFactory.get().binaryEncoder(out, null);
writer.write(datum, encoder);
encoder.flush();
out.close();
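To make concrete what ends up in the output stream for this record, here is a minimal sketch that reproduces the bytes by hand using only the JDK. It relies on three rules from the Avro binary encoding: a long is written as a zigzag variable-length integer, a string is its UTF-8 byte length (as a long) followed by the bytes, and a record is simply its fields in schema order. The class and method names are made up for illustration and are not Avro APIs; real code should use the Encoder as shown above.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hand-rolled illustration only; the names here are hypothetical, not Avro APIs.
final class AvroBinarySketch {

    // Zigzag-maps a signed long to an unsigned value, then writes it
    // seven bits per byte, low-order group first, with a continuation bit.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // A string is its UTF-8 length (as a long) followed by the UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // A StringPair record is its two string fields, in schema order.
    static byte[] encodeStringPair(String left, String right) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, left);
        writeString(out, right);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        for (byte b : encodeStringPair("L", "R")) {
            System.out.printf("%02x ", b);   // prints: 02 4c 02 52
        }
        System.out.println();
    }
}
```

The record with left "L" and right "R" therefore occupies four bytes: each one-character string contributes a length byte (1, zigzag-encoded as 02) and its single UTF-8 byte. Note that no field names or schema information appear in the output, which is why a reader needs the schema to decode it.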
There are two important objects here: the DatumWriter and the Encoder. A
DatumWriter translates data objects into the types understood by an Encoder, and
the Encoder then writes them to the output stream. Here we are using a GenericDatumWriter,