Interoperability
To demonstrate Avro's language interoperability, let's write a datafile using one language
(Python) and read it back with another (Java).
Python API
The program in Example 12-1 reads comma-separated strings from standard input and
writes them as StringPair records to an Avro datafile. As in the Java code for writing
a datafile, we create a DatumWriter and a DataFileWriter object. Notice that we
have embedded the Avro schema in the code, although we could equally well have read it
from a file.
Python represents Avro records as dictionaries; each line read from standard input is
turned into a dict object and appended to the DataFileWriter.
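To make the record-as-dictionary mapping concrete, here is a minimal sketch that uses only the Python standard library, so it runs without Avro installed. It shows how one comma-separated input line becomes a dict whose keys match the field names of the StringPair schema. The helper name line_to_record and the constant STRING_PAIR_SCHEMA are assumptions introduced for illustration; they are not part of the Avro API or of Example 12-1.

```python
import json

# The StringPair schema from Example 12-1, held as a JSON string.
STRING_PAIR_SCHEMA = """
{
  "type": "record",
  "name": "StringPair",
  "doc": "A pair of strings.",
  "fields": [
    {"name": "left", "type": "string"},
    {"name": "right", "type": "string"}
  ]
}
"""


def line_to_record(line):
    """Turn one 'left,right' line into a dict shaped like a StringPair record.

    Hypothetical helper for illustration only; not an Avro function.
    """
    left, right = line.strip().split(',')
    return {'left': left, 'right': right}


# Field names declared by the schema, read with the standard json module.
field_names = [field['name'] for field in json.loads(STRING_PAIR_SCHEMA)['fields']]

record = line_to_record('a,1\n')

# The dict's keys line up with the schema's field names.
assert sorted(record) == sorted(field_names)
```

A dict of this shape is what the program passes to the DataFileWriter for each input line. A line that does not contain exactly one comma raises a ValueError in this sketch, because the two-name unpacking fails.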
Example 12-1. A Python program for writing Avro record pairs to a datafile
import os
import string
import sys

from avro import schema
from avro import io
from avro import datafile

if __name__ == '__main__':
    if len(sys.argv) != 2:
        sys.exit('Usage: %s <data_file>' % sys.argv[0])
    avro_file = sys.argv[1]
    writer = open(avro_file, 'wb')
    datum_writer = io.DatumWriter()
    # The schema is embedded as a JSON string; triple quotes let the
    # double-quoted JSON keys and values appear unescaped.
    schema_object = schema.parse("""\
{ "type": "record",
  "name": "StringPair",
  "doc": "A pair of strings.",
  "fields": [
    {"name": "left", "type": "string"},
    {"name": "right", "type": "string"}
  ]
}""")
    dfw = datafile.DataFileWriter(writer, datum_writer, schema_object)
    for line in sys.stdin.readlines():
        # Each input line holds one comma-separated pair.
        (left, right) = line.strip().split(',')