Apache Avro
When data is processed in a distributed manner, objects are built and transferred
between the nodes of a cluster. To be transferred, an object must first be serialized.
Serialization is the process of transforming an in-memory object into a stream of
bytes. This stream of bytes is then sent over the wire to the destination node, which
reads the bytes and reconstructs the object. This reconstruction is called
deserialization. A serialized object can also be written to a persistent store such as
a file. Apache Avro is a serialization-deserialization framework used in Apache
Hadoop, where it handles interprocess communication between the different nodes in
a cluster.
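The round trip described above can be illustrated with a minimal sketch. It does not use the Avro library; it hand-encodes a small record following the rules of Avro's binary encoding (a long is zigzag-encoded and written as a variable-length integer, a string is its UTF-8 length as a long followed by the bytes, and a record is its fields in schema order). The record layout (a string field "name" followed by a long field "age") and all function names are assumptions made for illustration only.

```python
import io


def encode_long(n: int) -> bytes:
    """Zigzag-encode a signed 64-bit integer and write it as a varint."""
    z = (n << 1) ^ (n >> 63)  # maps small negative and positive values to small unsigned ones
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(stream: io.BytesIO) -> int:
    """Read a varint from the stream and undo the zigzag mapping."""
    z = 0
    shift = 0
    while True:
        b = stream.read(1)[0]
        z |= (b & 0x7F) << shift
        if not b & 0x80:  # continuation bit clear: last byte
            break
        shift += 7
    return (z >> 1) ^ -(z & 1)


def encode_string(s: str) -> bytes:
    """Length-prefixed UTF-8 string."""
    raw = s.encode("utf-8")
    return encode_long(len(raw)) + raw


def decode_string(stream: io.BytesIO) -> str:
    length = decode_long(stream)
    return stream.read(length).decode("utf-8")


def serialize(user: dict) -> bytes:
    """Serialize the hypothetical record: fields are written in a fixed order."""
    return encode_string(user["name"]) + encode_long(user["age"])


def deserialize(data: bytes) -> dict:
    """Rebuild the record by reading the fields back in the same order."""
    stream = io.BytesIO(data)
    name = decode_string(stream)
    age = decode_long(stream)
    return {"name": name, "age": age}


if __name__ == "__main__":
    original = {"name": "Ann", "age": 30}
    wire_bytes = serialize(original)      # what would travel between nodes
    restored = deserialize(wire_bytes)    # what the destination node rebuilds
    print(wire_bytes, restored == original)
```

Note that the byte stream carries no field names or type tags: the reader can only reconstruct the object because it knows the same field order as the writer. In Avro that shared knowledge is the schema; here it is simply hard-coded in serialize and deserialize.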