useful in detecting version differences in classes, but the story of the serialVersionUID is a
cautionary tale that may be of use to tell.
Object serialization allows us to take an object and convert it into a form that can be used to
reconstruct a copy of that object, either at a different place (when it is used, for example, in
RMI) or at a different time (when it is used, for example, in an ObjectOutputStream). But
any time that you try to reconstruct a copy of an object at a different place or time, you face
the problem of what to do if the class of the object has changed. Remember that objects are
not just a collection of data, but are also the code that allows manipulation of that data. Since
we aren't storing the code along with the object state in the serialized form, there is always
the chance that when we reconstruct the object, the class that defined the data or the methods
that manipulated the data have changed.
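
To make the round trip concrete, here is a minimal sketch that serializes an object to a byte array and rebuilds a copy from it. The class names RoundTripSketch and Point and the helper methods are hypothetical, chosen only for illustration. The stream calls are the standard java.io ones.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class RoundTripSketch {

    // Hypothetical class used only for illustration.
    static class Point implements Serializable {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Writes the object's state (not its code) into a byte array.
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Rebuilds a copy, using whatever class of that name this JVM can load.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Point original = new Point(3, 4);
        Point copy = (Point) deserialize(serialize(original));
        System.out.println(copy.x + "," + copy.y + " sameInstance=" + (copy == original));
    }
}
```

Only the field values travel in the byte array. The deserializing side supplies its own definition of Point, which is why a changed class is a problem.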
Java itself is defined as a name-equivalence language. This means that if two entities in the
language have the same name, then those two things are the same within the Java framework.
This is especially true for classes, which is why it is important to use packages (to assure that
you have a different fully qualified name for your different classes). But being programmers,
we realized that the same name in two different VMs might identify a class that had changed over
time (or space), and that this could lead to very ugly problems.
The original idea was to introduce the serialVersionUID as a mechanism to allow the system to
compare the class of an object that had been serialized with the class of that name available to
the JVM in which the object was being deserialized. What the system did (and still does if you
don't short-circuit it) was compare a fingerprint of the class in the deserializing VM with a
fingerprint of the class from the original context of the serialized object, using a form of
structural equivalence. In effect, the system wanted to make sure that all of the methods
(including name, parameters, and return values) and all of the fields that defined the two
classes were the same. The serialVersionUID was that fingerprint. A change in the order of
declarations shouldn't count as a difference in such a comparison, so the system determined the
serialVersionUID by taking a canonical ordering of the fields and methods and then taking a
secure hash of those strings. This gave a single value that the system could use to determine
changes over time in classes that were named the same. Attempting to deserialize an object that
has a different serialVersionUID than the class with the same name in the deserializing process
will result in an InvalidClassException, which is a subclass of IOException and therefore not an
explicit part of the signature of methods that deserialize objects.
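
The fingerprint can be inspected directly. The sketch below asks the runtime for the serialVersionUID of two classes through java.io.ObjectStreamClass: one that leaves the value to be computed from its structure, and one that declares it as a field, which is one way to short-circuit the computation. The class names FingerprintSketch, Computed, and Declared and the value 42L are hypothetical, chosen only for illustration.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class FingerprintSketch {

    // Hypothetical class with no declared serialVersionUID:
    // the runtime computes one from the class's structure.
    static class Computed implements Serializable {
        int count;
        String label;
    }

    // Hypothetical class that declares the value itself,
    // so no structural fingerprint is computed for it.
    static class Declared implements Serializable {
        private static final long serialVersionUID = 42L;
        int count;
        String label;
    }

    // Returns the serialVersionUID the runtime associates with a Serializable class.
    static long fingerprintOf(Class<?> cls) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(cls);
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("computed: " + fingerprintOf(Computed.class));
        System.out.println("declared: " + fingerprintOf(Declared.class));
    }
}
```

The value printed for Computed depends on the members the compiler actually emits for the class, so no particular number is assumed here. The value for Declared is whatever the field says.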
This approach was well-intentioned, clever, and ultimately doomed. It worked as long as there
were no Java compilers that would do clever things to the names of class members. It worked