Schema Resolution
We can choose to use a different schema for reading the data back (the reader's schema)
from the one we used to write it (the writer's schema). This is a powerful tool because it
enables schema evolution. To illustrate, consider a new schema for string pairs with an
added description field:
{
  "type": "record",
  "name": "StringPair",
  "doc": "A pair of strings with an added field.",
  "fields": [
    {"name": "left", "type": "string"},
    {"name": "right", "type": "string"},
    {"name": "description", "type": "string", "default": ""}
  ]
}
We can use this schema to read the data we serialized earlier because, crucially, we have
given the description field a default value (the empty string), [82] which Avro will use
when there is no such field defined in the records it is reading. Had we omitted the
default attribute, we would get an error when trying to read the old data.
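The rule just described can be modeled in a few lines of plain Java. This is only a toy sketch of the idea, not Avro's implementation and not its API: the class ResolutionSketch, the Field type, and the resolve method are hypothetical names introduced for illustration, and records are represented as simple maps rather than Avro records.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of reader-side field resolution (hypothetical; not part of Avro).
public class ResolutionSketch {

    /** One field of the reader's schema: a name plus an optional default. */
    public static final class Field {
        public final String name;
        public final boolean hasDefault;   // tracked separately so a default of null is allowed
        public final Object defaultValue;

        private Field(String name, boolean hasDefault, Object defaultValue) {
            this.name = name;
            this.hasDefault = hasDefault;
            this.defaultValue = defaultValue;
        }

        public static Field required(String name) {
            return new Field(name, false, null);
        }

        public static Field withDefault(String name, Object value) {
            return new Field(name, true, value);
        }
    }

    /**
     * Builds the record the reader sees: for each reader field, take the
     * written value if present, otherwise the default, otherwise fail.
     */
    public static Map<String, Object> resolve(Map<String, Object> written,
                                              List<Field> readerFields) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Field field : readerFields) {
            if (written.containsKey(field.name)) {
                result.put(field.name, written.get(field.name));
            } else if (field.hasDefault) {
                result.put(field.name, field.defaultValue);
            } else {
                throw new IllegalStateException(
                    "No value and no default for field: " + field.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A record written with the old two-field schema.
        Map<String, Object> old = new LinkedHashMap<>();
        old.put("left", "L");
        old.put("right", "R");

        // The reader's schema adds "description" with a default of "".
        List<Field> reader = Arrays.asList(
            Field.required("left"),
            Field.required("right"),
            Field.withDefault("description", ""));

        System.out.println(resolve(old, reader));
    }
}
```

In this model, replacing Field.withDefault("description", "") with Field.required("description") makes resolve throw, which mirrors the error described above for a new field declared without a default.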
NOTE
To make the default value null rather than the empty string, we would instead define the
description field using a union with the null Avro type:
{"name": "description", "type": ["null", "string"], "default": null}
When the reader's schema is different from the writer's, we use the constructor for
GenericDatumReader that takes two schema objects, the writer's and the reader's, in that
order:
// schema is the writer's schema, newSchema is the reader's schema, and
// out holds the bytes of the record serialized earlier.
DatumReader<GenericRecord> reader =
    new GenericDatumReader<GenericRecord>(schema, newSchema);
Decoder decoder =
    DecoderFactory.get().binaryDecoder(out.toByteArray(), null);
GenericRecord result = reader.read(null, decoder);
assertThat(result.get("left").toString(), is("L"));
assertThat(result.get("right").toString(), is("R"));
// description was never written, so the reader fills in its default.
assertThat(result.get("description").toString(), is(""));