Working with objects (Hibernate)

 

You now have an understanding of how Hibernate and ORM solve the static aspects of the object/relational mismatch. With what you know so far, it’s possible to solve the structural mismatch problem, but an efficient solution to the problem requires something more. You must investigate strategies for runtime data access, because they’re crucial to the performance of your applications. In short, you have to learn how to control the state of objects.

This and the following topics cover the behavioral aspect of the object/relational mismatch. We consider these problems to be at least as important as the structural problems discussed in previous topics. In our experience, many developers are only really aware of the structural mismatch and rarely pay attention to the more dynamic behavioral aspects of the mismatch.

In this topic, we discuss the lifecycle of objects (how an object becomes persistent, and how it stops being considered persistent) and the method calls and other actions that trigger these transitions. The Hibernate persistence manager, the Session, is responsible for managing object state, so we discuss how to use this important API. The main Java Persistence interface in EJB 3.0 is called EntityManager, and thanks to its close resemblance to the Hibernate APIs, it will be easy to learn alongside. Of course, you can skip quickly through this material if you aren’t working with Java Persistence or EJB 3.0, but we encourage you to read about both options and then decide what is better for your application.

Let’s start with persistent objects, their lifecycle, and the events which trigger a change of persistent state. Although some of the material may be formal, a solid understanding of the persistence lifecycle is essential.

The persistence lifecycle

Because Hibernate is a transparent persistence mechanism—classes are unaware of their own persistence capability—it’s possible to write application logic that is unaware whether the objects it operates on represent persistent state or temporary state that exists only in memory. The application shouldn’t necessarily need to care that an object is persistent when invoking its methods. You can, for example, invoke the calculateTotalPrice() business method on an instance of the Item class without having to consider persistence at all; e.g., in a unit test.

Any application with persistent state must interact with the persistence service whenever it needs to propagate state held in memory to the database (or vice versa). In other words, you have to call Hibernate (or the Java Persistence) interfaces to store and load objects.

When interacting with the persistence mechanism in that way, it’s necessary for the application to concern itself with the state and lifecycle of an object with respect to persistence. We refer to this as the persistence lifecycle: the states an object goes through during its life. We also use the term unit of work: a set of operations you consider one (usually atomic) group. Another piece of the puzzle is the persistence context provided by the persistence service. Think of the persistence context as a cache that remembers all the modifications and state changes you made to objects in a particular unit of work (this is somewhat simplified, but it’s a good starting point).

We now dissect all these terms: object and entity states, persistence contexts, and managed scope. You’re probably more accustomed to thinking about what statements you have to manage to get stuff in and out of the database (via JDBC and SQL). However, one of the key factors of your success with Hibernate (and Java Persistence) is your understanding of state management, so stick with us through this section.

Object states

Different ORM solutions use different terminology and define different states and state transitions for the persistence lifecycle. Moreover, the object states used internally may be different from those exposed to the client application. Hibernate defines only four states, hiding the complexity of its internal implementation from the client code.

The object states defined by Hibernate and their transitions in a state chart are shown in figure 9.1. You can also see the method calls to the persistence manager API that trigger transitions. This API in Hibernate is the Session. We discuss this chart in this topic; refer to it whenever you need an overview.

We’ve also included the states of Java Persistence entity instances in figure 9.1. As you can see, they’re almost equivalent to Hibernate’s, and most methods of the Session have a counterpart on the EntityManager API (shown in italics). In other words, Hibernate’s functionality is a superset of the subset standardized in Java Persistence.

Some methods are available on both APIs; for example, the Session has a persist() operation with the same semantics as the EntityManager’s counterpart. Others, like load() and getReference(), also share semantics but have different method names.

During its life, an object can transition from a transient object to a persistent object to a detached object. Let’s explore the states and transitions in more detail.

Figure 9.1 Object states and their transitions as triggered by persistence manager operations

Transient objects

Objects instantiated using the new operator aren’t immediately persistent. Their state is transient, which means they aren’t associated with any database table row and so their state is lost as soon as they’re no longer referenced by any other object. These objects have a lifespan that effectively ends at that time, and they become inaccessible and available for garbage collection. Java Persistence doesn’t include a term for this state; entity objects you just instantiated are new. We’ll continue to refer to them as transient to emphasize the potential for these instances to become managed by a persistence service.

Hibernate and Java Persistence consider all transient instances to be nontransactional; any modification of a transient instance isn’t known to a persistence context. This means that Hibernate doesn’t provide any rollback functionality for transient objects.

Objects that are referenced only by other transient instances are, by default, also transient. For an instance to transition from transient to persistent state, to become managed, requires either a call to the persistence manager or the creation of a reference from an already persistent instance.

Persistent objects

A persistent instance is an entity instance with a database identity, as defined in “Mapping entities with identity.” That means a persistent and managed instance has a primary key value set as its database identifier. (There are some variations to when this identifier is assigned to a persistent instance.)

Persistent instances may be objects instantiated by the application and then made persistent by calling one of the methods on the persistence manager. They may even be objects that became persistent when a reference was created from another persistent object that is already managed. Alternatively, a persistent instance may be an instance retrieved from the database by execution of a query, by an identifier lookup, or by navigating the object graph starting from another persistent instance.

Persistent instances are always associated with a persistence context. Hibernate caches them and can detect whether they have been modified by the application.

There is much more to be said about this state and how an instance is managed in a persistence context. We’ll get back to this later in this topic.

Removed objects

You can delete an entity instance in several ways: For example, you can remove it with an explicit operation of the persistence manager. It may also become available for deletion if you remove all references to it, a feature available only in Hibernate or in Java Persistence with a Hibernate extension setting (orphan deletion for entities).

An object is in the removed state if it has been scheduled for deletion at the end of a unit of work, but it’s still managed by the persistence context until the unit of work completes. In other words, a removed object shouldn’t be reused because it will be deleted from the database as soon as the unit of work completes. You should also discard any references you may hold to it in the application (of course, after you finish working with it—for example, after you’ve rendered the removal-confirmation screen your users see).

Detached objects

To understand detached objects, you need to consider a typical transition of an instance: First it’s transient, because it has just been created in the application. Now you make it persistent by calling an operation on the persistence manager. All of this happens in a single unit of work, and the persistence context for this unit of work is synchronized with the database at some point (when an SQL INSERT occurs).

The unit of work is now completed, and the persistence context is closed. But the application still has a handle: a reference to the instance that was saved. As long as the persistence context is active, the state of this instance is persistent. At the end of a unit of work, the persistence context closes. What is the state of the object you’re holding a reference to now, and what can you do with it?

We refer to these objects as detached, indicating that their state is no longer guaranteed to be synchronized with database state; they’re no longer attached to a persistence context. They still contain persistent data (which may soon be stale). You can continue working with a detached object and modify it. However, at some point you probably want to make those changes persistent—in other words, bring the detached instance back into persistent state.

Hibernate offers two operations, reattachment and merging, to deal with this situation. Java Persistence only standardizes merging. These features have a deep impact on how multitiered applications may be designed. The ability to return objects from one persistence context to the presentation layer and later reuse them in a new persistence context is a main selling point of Hibernate and Java Persistence. It enables you to create long units of work that span user think-time. We call this kind of long-running unit of work a conversation. We’ll get back to detached objects and conversations soon.
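The four states and the transitions described in this section can be summarized as a small state machine. The following is only a simplified sketch: ObjectState, LifecycleOp, and LifecycleSketch are hypothetical names invented for illustration (they aren’t Hibernate types), and each operation group stands for several real calls, which are named in the comments. Merging is folded into REATTACH here even though it returns a managed copy rather than changing the state of the detached instance itself.

```java
import java.util.EnumMap;
import java.util.Map;

/** Simplified lifecycle states; the names follow the text, not a Hibernate type. */
enum ObjectState { TRANSIENT, PERSISTENT, REMOVED, DETACHED }

/** Hypothetical operation groups; the comments name calls mentioned in the text. */
enum LifecycleOp {
    MAKE_PERSISTENT,  // e.g. save() or persist()
    REMOVE,           // e.g. delete() or remove()
    DETACH,           // e.g. evict(), clear(), or closing the persistence context
    REATTACH          // e.g. update(); merging is related but returns a managed copy
}

final class LifecycleSketch {
    private static final Map<ObjectState, Map<LifecycleOp, ObjectState>> TRANSITIONS =
            new EnumMap<>(ObjectState.class);

    static {
        allow(ObjectState.TRANSIENT, LifecycleOp.MAKE_PERSISTENT, ObjectState.PERSISTENT);
        allow(ObjectState.PERSISTENT, LifecycleOp.REMOVE, ObjectState.REMOVED);
        allow(ObjectState.PERSISTENT, LifecycleOp.DETACH, ObjectState.DETACHED);
        allow(ObjectState.DETACHED, LifecycleOp.REATTACH, ObjectState.PERSISTENT);
    }

    private static void allow(ObjectState from, LifecycleOp op, ObjectState to) {
        TRANSITIONS.computeIfAbsent(from, k -> new EnumMap<>(LifecycleOp.class)).put(op, to);
    }

    /** Returns the state after the operation, or fails if the sketch doesn't model it. */
    static ObjectState apply(ObjectState from, LifecycleOp op) {
        Map<LifecycleOp, ObjectState> byOp = TRANSITIONS.get(from);
        if (byOp == null || !byOp.containsKey(op)) {
            throw new IllegalStateException(op + " is not modeled for state " + from);
        }
        return byOp.get(op);
    }
}
```

For example, LifecycleSketch.apply(ObjectState.TRANSIENT, LifecycleOp.MAKE_PERSISTENT) yields PERSISTENT, whereas asking to REMOVE a TRANSIENT instance throws, because the sketch models no such transition.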

You should now have a basic understanding of object states and how transitions occur. Our next topic is the persistence context and the management of objects it provides.

The persistence context

You may consider the persistence context to be a cache of managed entity instances. The persistence context isn’t something you see in your application; it isn’t an API you can call. In a Hibernate application, we say that one Session has one internal persistence context. In a Java Persistence application, an EntityManager has a persistence context. All entities in persistent state and managed in a unit of work are cached in this context. We walk through the Session and EntityManager APIs later in this topic. Now you need to know what this (internal) persistence context is buying you.

The persistence context is useful for several reasons:

■ Hibernate can do automatic dirty checking and transactional write-behind.

■ Hibernate can use the persistence context as a first-level cache.

■ Hibernate can guarantee a scope of Java object identity.

■ Hibernate can extend the persistence context to span a whole conversation.

All these points are also valid for Java Persistence providers. Let’s look at each feature.

Automatic dirty checking

Persistent instances are managed in a persistence context—their state is synchronized with the database at the end of the unit of work. When a unit of work completes, state held in memory is propagated to the database by the execution of SQL INSERT, UPDATE, and DELETE statements (DML). This procedure may also occur at other times. For example, Hibernate may synchronize with the database before execution of a query. This ensures that queries are aware of changes made earlier during the unit of work.

Hibernate doesn’t update the database row of every single persistent object in memory at the end of the unit of work. ORM software must have a strategy for detecting which persistent objects have been modified by the application. We call this automatic dirty checking. An object with modifications that have not yet been propagated to the database is considered dirty. Again, this state isn’t visible to the application. With transparent transaction-level write-behind, Hibernate propagates state changes to the database as late as possible but hides this detail from the application. By executing DML as late as possible (toward the end of the database transaction), Hibernate tries to keep lock-times in the database as short as possible. (DML usually creates locks in the database that are held until the transaction completes.)

Hibernate is able to detect exactly which properties have been modified so that it’s possible to include only the columns that need updating in the SQL UPDATE statement. This may bring some performance gains. However, it’s usually not a significant difference and, in theory, could harm performance in some environments. By default, Hibernate includes all columns of a mapped table in the SQL UPDATE statement (hence, Hibernate can generate this basic SQL at startup, not at runtime). If you want to update only modified columns, you can enable dynamic SQL generation by setting dynamic-update="true" in a class mapping. The same mechanism is implemented for insertion of new records, and you can enable runtime generation of INSERT statements with dynamic-insert="true". We recommend you consider this setting when you have an extraordinarily large number of columns in a table (say, more than 50); at some point, the overhead network traffic for unchanged fields will be noticeable.

In rare cases, you may also want to supply your own dirty checking algorithm to Hibernate. By default, Hibernate compares an old snapshot of an object with the snapshot at synchronization time, and it detects any modifications that require an update of the database state. You can implement your own routine by supplying a custom findDirty() method with an org.hibernate.Interceptor for a Session. We’ll show you an implementation of an interceptor later in the topic.
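To make the snapshot comparison concrete, here is a minimal sketch in plain Java. It isn’t Hibernate’s implementation: DirtyCheckSketch, its methods, and the ITEM table with its column names are assumptions made up for illustration. It compares a snapshot of property values with the current values and then builds an UPDATE that names only the changed columns, which is the idea behind dynamic update generation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

final class DirtyCheckSketch {

    /** Returns the names of properties whose current value differs from the snapshot. */
    static List<String> findDirtyProperties(Map<String, Object> snapshot,
                                            Map<String, Object> current) {
        List<String> dirty = new ArrayList<>();
        for (Map.Entry<String, Object> entry : current.entrySet()) {
            if (!Objects.equals(snapshot.get(entry.getKey()), entry.getValue())) {
                dirty.add(entry.getKey());
            }
        }
        return dirty;
    }

    /** Builds an UPDATE naming only the dirty columns, or null when nothing changed. */
    static String buildDynamicUpdate(String table, String idColumn, List<String> dirty) {
        if (dirty.isEmpty()) {
            return null;  // a clean object needs no UPDATE at all
        }
        StringBuilder sql = new StringBuilder("update ").append(table).append(" set ");
        for (int i = 0; i < dirty.size(); i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append(dirty.get(i)).append(" = ?");
        }
        return sql.append(" where ").append(idColumn).append(" = ?").toString();
    }

    /** Small demonstration with a hypothetical ITEM row. */
    static String demo() {
        Map<String, Object> snapshot = new LinkedHashMap<>();
        snapshot.put("NAME", "Laptop");
        snapshot.put("INITIAL_PRICE", 100);
        Map<String, Object> current = new LinkedHashMap<>(snapshot);
        current.put("INITIAL_PRICE", 120);  // the application modifies one property
        return buildDynamicUpdate("ITEM", "ITEM_ID", findDirtyProperties(snapshot, current));
    }
}
```

In this sketch, demo() returns "update ITEM set INITIAL_PRICE = ? where ITEM_ID = ?" because only one property differs from the snapshot.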

We’ll also get back to the synchronization process (known as flushing) and when it occurs later in this topic.

The persistence context cache

A persistence context is a cache of persistent entity instances. This means it remembers all persistent entity instances you’ve handled in a particular unit of work. Automatic dirty checking is one of the benefits of this caching. Another benefit is repeatable read for entities and the performance advantage of a unit of work-scoped cache.

For example, if Hibernate is told to load an object by primary key (a lookup by identifier), it can first check the persistence context for the current unit of work. If the entity is found there, no database hit occurs—this is a repeatable read for an application. The same is true if a query is executed through one of the Hibernate (or Java Persistence) interfaces. Hibernate reads the result set of the query and marshals entity objects that are then returned to the application. During this process, Hibernate interacts with the current persistence context. It tries to resolve every entity instance in this cache (by identifier); only if the instance can’t be found in the current persistence context does Hibernate read the rest of the data from the result set.
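The lookup order described above can be sketched as a map keyed by identifier. This is a simplified stand-in, not Hibernate code: FirstLevelCacheSketch, the databaseLoader callback, and the hit counter are hypothetical names used only to show that a second lookup for the same identifier doesn’t reach the database, and that query results are resolved against the context before a row is turned into a new object.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

/** Stand-in for a persistence context: at most one cached instance per identifier. */
final class FirstLevelCacheSketch<T> {
    private final Map<Long, T> managed = new HashMap<>();
    private final Function<Long, T> databaseLoader;  // hypothetical "hit the database" callback
    private int databaseHits = 0;

    FirstLevelCacheSketch(Function<Long, T> databaseLoader) {
        this.databaseLoader = databaseLoader;
    }

    /** Lookup by identifier: check the context first, load only on a miss. */
    T get(Long id) {
        T cached = managed.get(id);
        if (cached != null) {
            return cached;
        }
        databaseHits++;
        T loaded = databaseLoader.apply(id);
        if (loaded != null) {
            managed.put(id, loaded);
        }
        return loaded;
    }

    /** Query results: reuse a managed instance, otherwise build one from the row. */
    T resolveFromRow(Long id, Supplier<T> hydrateFromRow) {
        return managed.computeIfAbsent(id, key -> hydrateFromRow.get());
    }

    int databaseHits() {
        return databaseHits;
    }
}
```

Calling get(1234L) twice on the same FirstLevelCacheSketch returns the same instance and increments databaseHits() only once; resolveFromRow() for an identifier that is already managed ignores the row data and returns the managed instance.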

The persistence context cache offers significant performance benefits and improves the isolation guarantees in a unit of work (you get repeatable read of entity instances for free). Because this cache only has the scope of a unit of work, it has no real disadvantages, such as lock management for concurrent access—a unit of work is processed in a single thread at a time.

The persistence context cache sometimes helps avoid unnecessary database traffic; but, more important, it ensures that:

■ The persistence layer isn’t vulnerable to stack overflows in the case of circular references in a graph of objects.

■ There can never be conflicting representations of the same database row at the end of a unit of work. In the persistence context, at most a single object represents any database row. All changes made to that object may be safely written to the database.

■ Likewise, changes made in a particular persistence context are always immediately visible to all other code executed inside that persistence context and its unit of work (the repeatable read for entities guarantee).

You don’t have to do anything special to enable the persistence context cache. It’s always on and, for the reasons shown, can’t be turned off.

Later in this topic, we’ll show you how objects are added to this cache (basically, whenever they become persistent) and how you can manage this cache (by detaching objects manually from the persistence context, or by clearing the persistence context).

The last two items on our list of benefits of a persistence context, the guaranteed scope of identity and the possibility to extend the persistence context to span a conversation, are closely related concepts. To understand them, you need to take a step back and consider objects in detached state from a different perspective.

Object identity and equality

A basic Hibernate client/server application may be designed with server-side units of work that span a single client request. When a request from the application user requires data access, a new unit of work is started. The unit of work ends when processing is complete and the response for the user is ready. This is also called the session-per-request strategy (you can replace the word session with persistence context whenever you read something like this, but it doesn’t roll off the tongue as well).

We already mentioned that Hibernate can support an implementation of a possibly long-running unit of work, called a conversation. We introduce the concept of conversations in the following sections as well as the fundamentals of object identity and when objects are considered equal—which can impact how you think about and design conversations.

Why is the concept of a conversation useful?

Introducing conversations

For example, in web applications, you don’t usually maintain a database transaction across a user interaction. Users take a long time to think about modifications, but, for scalability reasons, you must keep database transactions short and release database resources as soon as possible. You’ll likely face this issue whenever you need to guide the user through several screens to complete a unit of work (from the user’s perspective)—for example, to fill an online form. In this common scenario, it’s extremely useful to have the support of the persistence service, so you can implement such a conversation with a minimum of coding and best scalability.

Two strategies are available to implement a conversation in a Hibernate or Java Persistence application: with detached objects or by extending a persistence context. Both have strengths and weaknesses.

Figure 9.2 Conversation implementation with detached object state

The detached object state and the already mentioned features of reattachment or merging are ways to implement a conversation. Objects are held in detached state during user think-time, and any modification of these objects is made persistent manually through reattachment or merging. This strategy is also called session-per-request-with-detached-objects. You can see a graphical illustration of this conversation pattern in figure 9.2.

A persistence context only spans the processing of a particular request, and the application manually reattaches and merges (and sometimes detaches) entity instances during the conversation.

The alternative approach doesn’t require manual reattachment or merging: With the session-per-conversation pattern, you extend a persistence context to span the whole unit of work (see figure 9.3).

First we have a closer look at detached objects and the problem of identity you’ll face when you implement a conversation with this strategy.

Figure 9.3 Conversation implementation with an extended persistence context

The scope of object identity

As application developers, we identify an object using Java object identity (a==b). If an object changes state, is the Java identity guaranteed to be the same in the new state? In a layered application, that may not be the case.

In order to explore this, it’s extremely important to understand the relationship between Java identity, a==b, and database identity, x.getId().equals( y.getId() ). Sometimes they’re equivalent; sometimes they aren’t. We refer to the conditions under which Java identity is equivalent to database identity as the scope of object identity.

For this scope, there are three common choices:

■ A primitive persistence layer with no identity scope makes no guarantees that if a row is accessed twice the same Java object instance will be returned to the application. This becomes problematic if the application modifies two different instances that both represent the same row in a single unit of work. (How should we decide which state should be propagated to the database?)

■ A persistence layer using persistence context-scoped identity guarantees that, in the scope of a single persistence context, only one object instance represents a particular database row. This avoids the previous problem and also allows for some caching at the context level.

■ Process-scoped identity goes one step further and guarantees that only one object instance represents the row in the whole process (JVM).

For a typical web or enterprise application, persistence context-scoped identity is preferred. Process-scoped identity does offer some potential advantages in terms of cache utilization and the programming model for reuse of instances across multiple units of work. However, in a pervasively multithreaded application, the cost of always synchronizing shared access to persistent objects in the global identity map is too high a price to pay. It’s simpler, and more scalable, to have each thread work with a distinct set of persistent instances in each persistence context.

We would say that Hibernate implements persistence context-scoped identity. So, by nature, Hibernate is best suited for highly concurrent data access in multiuser applications. However, we already mentioned some issues you’ll face when objects aren’t associated with a persistence context. Let’s discuss this with an example.

The Hibernate identity scope is the scope of a persistence context. Let’s see how this works in code with Hibernate APIs—the Java Persistence code is the equivalent with EntityManager instead of Session. Even though we haven’t shown you much about these interfaces, the following examples are simple, and you should have no problems understanding the methods we call on the Session.

If you request two objects using the same database identifier value in the same Session, the result is two references to the same in-memory instance. Listing 9.1 demonstrates this with several get() operations in two Sessions.

Listing 9.1 The guaranteed scope of object identity in Hibernate

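As a rough sketch of the behavior Listing 9.1 describes, the following uses a hypothetical SessionStandIn class in place of a real Hibernate Session so that it runs without a database. Its get(id) method only mimics the idea of session.get(Item.class, id); the identifier value 1234 and all class names are assumptions made for illustration. The references a, b, and c and the two sessions correspond to the names used in the discussion.

```java
import java.util.HashMap;
import java.util.Map;

final class IdentityScopeSketch {

    /** Minimal entity; only the identifier matters here. */
    static class Item {
        private final Long id;
        Item(Long id) { this.id = id; }
        Long getId() { return id; }
    }

    /** Stand-in for a Session: each instance owns its own persistence context. */
    static class SessionStandIn {
        private final Map<Long, Item> persistenceContext = new HashMap<>();

        /** Mimics a lookup by identifier; no database is involved in this sketch. */
        Item get(Long id) {
            return persistenceContext.computeIfAbsent(id, Item::new);
        }
    }

    /** Returns { a == b, a == c, a and c have the same database identity }. */
    static boolean[] demo() {
        SessionStandIn session1 = new SessionStandIn();
        Item a = session1.get(1234L);
        Item b = session1.get(1234L);
        boolean sameInOneSession = (a == b);        // true: same persistence context

        // session1 ends here; a and b now stand for detached references

        SessionStandIn session2 = new SessionStandIn();
        Item c = session2.get(1234L);
        boolean sameAcrossSessions = (a == c);      // false: different persistence contexts
        boolean sameDatabaseIdentity = a.getId().equals(c.getId());  // true

        return new boolean[] { sameInOneSession, sameAcrossSessions, sameDatabaseIdentity };
    }
}
```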

Object references a and b have not only the same database identity, but also the same Java identity, because they’re obtained in the same Session. They reference the same persistent instance known to the persistence context for that unit of work. Once you’re outside this boundary, however, Hibernate doesn’t guarantee Java identity, so a and c aren’t identical. Of course, a test for database identity, a.getId().equals( c.getId() ), will still return true.

If you work with objects in detached state, you’re dealing with objects that are living outside of a guaranteed scope of object identity.

The identity of detached objects

If an object reference leaves the scope of guaranteed identity, we call it a reference to a detached object. In listing 9.1, all three object references, a, b, and c, are equal if we only consider database identity—their primary key value. However, they aren’t identical in-memory object instances. This can lead to problems if you treat them as equal in detached state. For example, consider the following extension of the code, after session2 has ended:

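A minimal stand-alone sketch of this situation follows. It assumes an Item class that doesn’t override equals() or hashCode(), and it creates the three references directly instead of loading them: a and b point to one instance, c to a second instance with the same identifier. The class names and the identifier value are hypothetical; only the Set named allObjects and the three add() calls reflect the scenario discussed here.

```java
import java.util.HashSet;
import java.util.Set;

final class DetachedSetSketch {

    /** Entity without equals()/hashCode() overrides, so java.lang.Object's versions apply. */
    static class Item {
        private final Long id;
        Item(Long id) { this.id = id; }
        Long getId() { return id; }
    }

    static int demo() {
        Item a = new Item(1234L);  // stands for the instance loaded in the first Session
        Item b = a;                // second reference to that same instance
        Item c = new Item(1234L);  // stands for the instance loaded in the second Session

        Set<Item> allObjects = new HashSet<>();
        allObjects.add(a);
        allObjects.add(b);
        allObjects.add(c);
        return allObjects.size();  // 2 with the inherited identity-based equals()
    }
}
```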

All three references have been added to a Set. All are references to detached objects. Now, if you check the size of the collection, the number of elements, what result do you expect?

First you have to realize the contract of a Java Set: No duplicate elements are allowed in such a collection. Duplicates are detected by the Set; whenever you add an object, its equals() method is called automatically. The added object is checked against all other elements already in the collection. If equals() returns true for any object already in the collection, the addition doesn’t occur.

If you know the implementation of equals() for the objects, you can find out the number of elements you can expect in the Set. By default, all Java classes inherit the equals() method of java.lang.Object. This implementation uses a double-equals (==) comparison; it checks whether two references refer to the same in-memory instance on the Java heap.

You may guess that the number of elements in the collection is two. After all, a and b are references to the same in-memory instance; they have been loaded in the same persistence context. Reference c is obtained in a second Session; it refers to a different instance on the heap. You have three references to two instances. However, you know this only because you’ve seen the code that loaded the objects. In a real application, you may not know that a and b are loaded in the same Session and c in another.

Furthermore, you obviously expect that the collection has exactly one element, because a, b, and c represent the same database row.

Whenever you work with objects in detached state, and especially if you test them for equality (usually in hash-based collections), you need to supply your own implementation of the equals() and hashCode() methods for your persistent classes.

Understanding equals() and hashCode()

Before we show you how to implement your own equality routine, we have to bring two important points to your attention. First, in our experience, many Java developers never had to override the equals() and hashCode() methods before using Hibernate (or Java Persistence). Traditionally, Java developers seem to be unaware of the intricate details of such an implementation. The longest discussion threads on the public Hibernate forum are about this equality problem, and the “blame” is often put on Hibernate. You should be aware of the fundamental issue: Every object-oriented programming language with hash-based collections requires a custom equality routine if the default contract doesn’t offer the desired semantics. The detached object state in a Hibernate application exposes you to this problem, maybe for the first time.

On the other hand, you may not have to override equals() and hashCode(). The identity scope guarantee provided by Hibernate is sufficient if you never compare detached instances—that is, if you never put detached instances into the same Set. You may decide to design an application that doesn’t use detached objects. You can apply an extended persistence context strategy for your conversation implementation and eliminate the detached state from your application completely. This strategy also extends the scope of guaranteed object identity to span the whole conversation. (Note that you still need the discipline to not compare detached instances obtained in two conversations!)

Let’s assume that you want to use detached objects and that you have to test them for equality with your own routine. You can implement equals() and hashCode() several ways. Keep in mind that when you override equals(), you always need to also override hashCode() so the two methods are consistent. If two objects are equal, they must have the same hashcode.

A clever approach is to implement equals() to compare just the database identifier property (often a surrogate primary key) value:

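A minimal sketch of such an identifier-based implementation, assuming a User class with a surrogate id property of type Long (the class layout and the helper method are illustrative assumptions, not a definitive implementation). The helper foundAfterIdAssignment() additionally checks what happens to a HashSet when the identifier is assigned after the instance has been added.

```java
import java.util.HashSet;
import java.util.Set;

final class IdEqualitySketch {

    static class User {
        private Long id;  // assigned only when the instance becomes persistent

        Long getId() { return id; }
        void setId(Long id) { this.id = id; }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (id == null) return false;  // transient: only Java identity counts
            if (!(other instanceof User)) return false;
            return id.equals(((User) other).getId());
        }

        @Override
        public int hashCode() {
            return id == null ? System.identityHashCode(this) : id.hashCode();
        }
    }

    /** Returns whether the Set still finds the instance after its identifier is assigned. */
    static boolean foundAfterIdAssignment() {
        User user = new User();
        Set<User> users = new HashSet<>();
        users.add(user);                   // hashed by Java identity while id is null
        int hashBefore = user.hashCode();
        user.setId(hashBefore + 1L);       // simulated save; value chosen so the hash differs
        return users.contains(user);       // false: the Set looks under the new hash value
    }
}
```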

Notice how this equals() method falls back to Java identity for transient instances (if id==null) that don’t have a database identifier value assigned yet. This is reasonable, because they can’t possibly be equal to a detached instance, which has an identifier value.

Unfortunately, this solution has one huge problem: Identifier values aren’t assigned by Hibernate until an object becomes persistent. If a transient object is added to a Set before being saved, its hash value may change while it’s contained by the Set, contrary to the contract of java.util.Set. In particular, this problem makes cascade save (discussed later in the topic) useless for sets. We strongly discourage this solution (database identifier equality).

A better way is to include all persistent properties of the persistent class, apart from any database identifier property, in the equals() comparison. This is how most people perceive the meaning of equals(); we call it by-value equality.

When we say all properties, we don’t mean to include collections. Collection state is associated with a different table, so it seems wrong to include it. More important, you don’t want to force the entire object graph to be retrieved just to perform equals(). In the case of User, this means you shouldn’t include the boughtItems collection in the comparison. This is the implementation you can write:

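One way to write it is sketched here with the username and password properties that the surrounding text mentions. Which properties a real User class has is an assumption on our part, and the boughtItems collection is deliberately left out of the comparison.

```java
import java.util.Objects;

final class ValueEqualitySketch {

    static class User {
        private String username;
        private String password;
        // A boughtItems collection would exist here but stays out of equals()/hashCode().

        User(String username, String password) {
            this.username = username;
            this.password = password;
        }

        String getUsername() { return username; }
        String getPassword() { return password; }
        void setPassword(String password) { this.password = password; }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof User)) return false;
            User that = (User) other;
            return Objects.equals(username, that.username)
                    && Objects.equals(password, that.password);
        }

        @Override
        public int hashCode() {
            return Objects.hash(username, password);
        }
    }
}
```

With this sketch, two User instances created with the same username and password are equal, but they stop being equal as soon as setPassword() is called on one of them.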

However, there are again two problems with this approach. First, instances from different Sessions are no longer equal if one is modified (for example, if the user changes the password). Second, instances with different database identity (instances that represent different rows of the database table) can be considered equal unless some combination of properties is guaranteed to be unique (the database columns have a unique constraint). In the case of User, there is a unique property: username.

This leads us to the preferred (and semantically correct) implementation of an equality check. You need a business key.

Implementing equality with a business key

To get to the solution that we recommend, you need to understand the notion of a business key. A business key is a property, or some combination of properties, that is unique for each instance with the same database identity. Essentially, it’s the natural key that you would use if you weren’t using a surrogate primary key instead. Unlike a natural primary key, it isn’t an absolute requirement that the business key never changes—as long as it changes rarely, that’s enough.

We argue that essentially every entity class should have some business key, even if it includes all properties of the class (this would be appropriate for some immutable classes). The business key is what the user thinks of as uniquely identifying a particular record, whereas the surrogate key is what the application and database use.

Business key equality means that the equals() method compares only the properties that form the business key. This is a perfect solution that avoids all the problems described earlier. The only downside is that it requires extra thought to identify the correct business key in the first place. This effort is required anyway; it’s important to identify any unique keys if your database must ensure data integrity via constraint checking.

For the User class, username is a great candidate business key. It’s never null, it’s unique with a database constraint, and it changes rarely, if ever:

[Listing not available: equals() and hashCode() for User, based on the username business key.]
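A minimal sketch of business key equality for User might look as follows. The constructor, the id field, and the sample values are assumptions made for illustration; only the idea of comparing username alone comes from the text.

```java
import java.util.HashSet;
import java.util.Set;

class BusinessKeyDemo {

    // Hypothetical, simplified User; username is the business key.
    static class User {
        private final Long id;           // surrogate key, ignored by equals()
        private final String username;   // business key
        private String password;

        User(Long id, String username, String password) {
            this.id = id;
            this.username = username;
            this.password = password;
        }

        public Long getId() { return id; }
        public String getUsername() { return username; }
        public String getPassword() { return password; }
        public void setPassword(String password) { this.password = password; }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof User)) return false;
            User that = (User) other;
            // Only the business key takes part in the comparison.
            return getUsername().equals(that.getUsername());
        }

        @Override
        public int hashCode() {
            return getUsername().hashCode();
        }
    }

    public static void main(String[] args) {
        User detached = new User(1L, "johndoe", "secret");
        User persistent = new User(1L, "johndoe", "secret");
        persistent.setPassword("changed"); // a non-key property changes

        Set<User> users = new HashSet<>();
        users.add(detached);
        users.add(persistent);

        System.out.println(detached.equals(persistent)); // true
        System.out.println(users.size());                // 1
    }
}
```

Because the password is not part of the key, changing it on one copy doesn’t affect equality or the behavior of the Set.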

For some other classes, the business key may be more complex, consisting of a combination of properties. Here are some hints that should help you identify a business key in your classes:

■ Consider what attributes users of your application will refer to when they have to identify an object (in the real world). How do users tell the difference between one object and another if they’re displayed on the screen? This is probably the business key you’re looking for.

■ Every attribute that is immutable is probably a good candidate for the business key. Mutable attributes may be good candidates, if they’re updated rarely or if you can control the situation when they’re updated.

■ Every attribute that has a UNIQUE database constraint is a good candidate for the business key. Remember that the precision of the business key has to be good enough to avoid overlaps.

■ Any date or time-based attribute, such as the creation time of the record, is usually a good component of a business key. However, the accuracy of System.currentTimeMillis() depends on the virtual machine and operating system. Our recommended safety buffer is 50 milliseconds, which may not be accurate enough if the time-based property is the single attribute of a business key.

■ You can use database identifiers as part of the business key. This seems to contradict our previous statements, but we aren’t talking about the database identifier of the given class. You may be able to use the database identifier of an associated object. For example, a candidate business key for the Bid class is the identifier of the Item it was made for together with the bid amount. You may even have a unique constraint that represents this composite business key in the database schema. You can use the identifier value of the associated Item because it never changes during the lifecycle of a Bid—setting an already persistent Item is required by the Bid constructor.
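The last hint, a composite business key built from the identifier of an associated object, can be sketched as follows. The Item and Bid classes here are stripped-down assumptions; in particular, the use of BigDecimal for the amount and the constructor check are illustrative choices.

```java
import java.math.BigDecimal;
import java.util.Objects;

class BidBusinessKeyDemo {

    // Hypothetical, minimal Item: only the identifier matters here.
    static class Item {
        private final Long id;
        Item(Long id) { this.id = id; }
        public Long getId() { return id; }
    }

    static class Bid {
        private final Item item;
        private final BigDecimal amount;

        // The constructor insists on an Item that already has an identifier,
        // so the business key is stable for the lifetime of the Bid.
        Bid(Item item, BigDecimal amount) {
            Objects.requireNonNull(item, "item");
            if (item.getId() == null) {
                throw new IllegalArgumentException("Item must already be persistent");
            }
            this.item = item;
            this.amount = Objects.requireNonNull(amount, "amount");
        }

        public Item getItem() { return item; }
        public BigDecimal getAmount() { return amount; }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Bid)) return false;
            Bid that = (Bid) other;
            // compareTo() ignores scale, so 10.0 and 10.00 are the same amount.
            return getItem().getId().equals(that.getItem().getId())
                && getAmount().compareTo(that.getAmount()) == 0;
        }

        @Override
        public int hashCode() {
            // Strip trailing zeros so the hash agrees with compareTo()-based equality.
            return Objects.hash(getItem().getId(), getAmount().stripTrailingZeros());
        }
    }

    public static void main(String[] args) {
        Bid first = new Bid(new Item(42L), new BigDecimal("10.0"));
        Bid second = new Bid(new Item(42L), new BigDecimal("10.00"));
        System.out.println(first.equals(second)); // true
    }
}
```

Note that the comparison reaches the Item only through getItem().getId(), so it never depends on any other state of the associated object.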

If you follow our advice, you shouldn’t have much difficulty finding a good business key for all your business classes. If you have a difficult case, try to solve it without considering Hibernate—after all, it’s purely an object-oriented problem. Notice that it’s almost never correct to override equals() on a subclass and include another property in the comparison. It’s a little tricky to satisfy the requirements that equality be both symmetric and transitive in this case; and, more important, the business key may not correspond to any well-defined candidate natural key in the database (subclass properties may be mapped to a different table).

You may have also noticed that the equals() and hashCode() methods always access the properties of the “other” object via the getter methods. This is extremely important, because the object instance passed as other may be a proxy object, not the actual instance that holds the persistent state. To initialize this proxy to get the property value, you need to access it with a getter method. This is one point where Hibernate isn’t completely transparent. However, it’s a good practice to use getter methods instead of direct instance variable access anyway.

Let’s switch perspective now and consider an implementation strategy for conversations that doesn’t require detached objects and doesn’t expose you to any of the problems of detached object equality. If the identity scope issues you’ll possibly be exposed to when you work with detached objects seem too much of a burden, the second conversation-implementation strategy may be what you’re looking for. Hibernate and Java Persistence support the implementation of conversations with an extended persistence context: the session-per-conversation strategy.

Extending a persistence context

A particular conversation reuses the same persistence context for all interactions. All request processing during a conversation is managed by the same persistence context. The persistence context isn’t closed after a request from the user has been processed. It’s disconnected from the database and held in this state during user think-time. When the user continues in the conversation, the persistence context is reconnected to the database, and the next request can be processed. At the end of the conversation, the persistence context is synchronized with the database and closed. The next conversation starts with a fresh persistence context and doesn’t reuse any entity instances from the previous conversation; the pattern is repeated.

Note that this eliminates the detached object state! All instances are either transient (not known to a persistence context) or persistent (attached to a particular persistence context). This also eliminates the need for manual reattachment or merging of object state between contexts, which is one of the advantages of this strategy. (You still may have detached objects between conversations, but we consider this a special case that you should try to avoid.)

In Hibernate terms, this strategy uses a single Session for the duration of the conversation. Java Persistence has built-in support for extended persistence contexts and can even automatically store the disconnected context for you (in a stateful EJB session bean) between requests.

We’ll get back to conversations later in the topic and show you all the details about the two implementation strategies. You don’t have to choose one right now, but you should be aware of the consequences these strategies have on object state and object identity, and you should understand the necessary transitions in each case.

We now explore the persistence manager APIs and how you make the theory behind object states work in practice.

The Hibernate interfaces

Any transparent persistence tool includes a persistence manager API. This persistence manager usually provides services for the following:

■ Basic CRUD (create, retrieve, update, delete) operations

■ Query execution

■ Control of transactions

■ Management of the persistence context

The persistence manager may be exposed by several different interfaces. In the case of Hibernate, these are Session, Query, Criteria, and Transaction. Under the covers, the implementations of these interfaces are coupled tightly together.

In Java Persistence, the main interface you interact with is the EntityManager; it has the same role as the Hibernate Session. Other Java Persistence interfaces are Query and EntityTransaction (you can probably guess what their counterpart in native Hibernate is).

We’ll now show you how to load and store objects with Hibernate and Java Persistence. Sometimes both have exactly the same semantics and API, and even the method names are the same. It’s therefore much more important to keep your eyes open for little differences. To make this part of the topic easier to understand, we decided to use a different strategy than usual and explain Hibernate first and then Java Persistence.

Let’s start with Hibernate, assuming that you write an application that relies on the native API.

Storing and loading objects

In a Hibernate application, you store and load objects by essentially changing their state. You do this in units of work. A single unit of work is a set of operations considered an atomic group. If you’re guessing now that this is closely related to transactions, you’re right. But it isn’t necessarily the same thing. We have to approach this step by step; for now, consider a unit of work a particular sequence of state changes to your objects that you’d group together. First you have to begin a unit of work.

Beginning a unit of work

At the beginning of a unit of work, an application obtains an instance of Session from the application’s SessionFactory:

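[Listing not available: obtaining a Session from the SessionFactory and, on the second line, beginning a Transaction.]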

At this point, a new persistence context is also initialized for you, and it will manage all the objects you work with in that Session. The application may have multiple SessionFactorys if it accesses several databases. How the SessionFactory is created and how you get access to it in your application code depends on your deployment environment and configuration; you should have the simple HibernateUtil startup helper class ready.

You should never create a new SessionFactory just to service a particular request. Creation of a SessionFactory is extremely expensive. On the other hand, Session creation is extremely inexpensive. The Session doesn’t even obtain a JDBC Connection until a connection is required.

The second line in the previous code begins a Transaction on another Hibernate interface. All operations you execute inside a unit of work occur inside a transaction, no matter if you read or write data. However, the Hibernate Transaction API is optional, and you may begin a transaction in any way you like; we’ll explore these options in the next topic. If you use the Hibernate Transaction API, your code works in all environments, so you’ll do this for all examples in the following sections.

After opening a new Session and persistence context, you use it to load and save objects.

Making an object persistent

The first thing you want to do with a Session is make a new transient object persistent with the save() method (listing 9.2).

Listing 9.2 Making a transient instance persistent


A new transient object item is instantiated as usual. Of course, you may also instantiate it after opening a Session; they aren’t related yet. A new Session is opened using the SessionFactory. You start a new transaction.

A call to save() makes the transient instance of Item persistent. It’s now associated with the current Session and its persistence context.

The changes made to persistent objects have to be synchronized with the database at some point. This happens when you commit() the Hibernate Transaction. We say a flush occurs (you can also call flush() manually; more about this later). To synchronize the persistence context, Hibernate obtains a JDBC connection and issues a single SQL INSERT statement. Note that this isn’t always true for insertion: Hibernate guarantees that the item object has an assigned database identifier after it has been saved, so an earlier INSERT may be necessary, depending on the identifier generator you have enabled in your mapping. The save() operation also returns the database identifier of the persistent instance.

The Session can finally be closed, and the persistence context ends. The reference item is now a reference to an object in detached state.

You can see the same unit of work and how the object changes state in figure 9.4.

It’s better (but not required) to fully initialize the Item instance before managing it with a Session. The SQL INSERT statement contains the values that were held by the object at the point when save() was called. You can modify the object after calling save(), and your changes will be propagated to the database as an (additional) SQL UPDATE.


Everything between session.beginTransaction() and tx.commit() occurs in one transaction. For now, keep in mind that all database operations in transaction scope either completely succeed or completely fail. If one of the UPDATE or INSERT statements made during flushing on tx.commit() fails, all changes made to persistent objects in this transaction are rolled back at the database level. However, Hibernate doesn’t roll back in-memory changes to persistent objects. This is reasonable because a failure of a transaction is normally nonrecoverable, and you have to discard the failed Session immediately. We discuss exception handling in the next topic.

Retrieving a persistent object

The Session is also used to query the database and retrieve existing persistent objects. Hibernate is especially powerful in this area, as you’ll see later in the topic. Two special methods are provided for the simplest kind of query: retrieval by identifier. The get() and load() methods are demonstrated in listing 9.3.

Listing 9.3 Retrieval of an Item by identifier

You can see the same unit of work in figure 9.5.

The retrieved object item is in persistent state; as soon as the persistence context is closed, it’s in detached state.


One difference between get() and load() is how they indicate that the instance could not be found. If no row with the given identifier value exists in the database, get() returns null. The load() method throws an ObjectNotFoundException. Which style of error handling you prefer is your choice.

More important, the load() method may return a proxy, a placeholder, without hitting the database. A consequence of this is that you may get an ObjectNotFoundException later, as soon as you try to access the returned placeholder and force its initialization (this is also called lazy loading; we discuss load optimization in later topics). The load() method always tries to return a proxy, and only returns an initialized object instance if it’s already managed by the current persistence context. In the example shown earlier, no database hit occurs at all! The get() method, on the other hand, never returns a proxy; it hits the database unless the instance is already managed by the current persistence context.

You may ask why this option is useful—after all, you retrieve an object to access it. It’s common to obtain a persistent instance to assign it as a reference to another instance. For example, imagine that you need the item only for a single purpose: to set an association with a Comment: aComment.setForAuction(item). If this is all you plan to do with the item, a proxy will do fine; there is no need to hit the database. In other words, when the Comment is saved, you need the foreign key value of an item inserted into the COMMENT table. The proxy of an Item provides just that: an identifier value wrapped in a placeholder that looks like the real thing.
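The difference in behavior can be imitated with a toy model in plain Java. The sketch below is not Hibernate code and doesn’t reflect how Hibernate implements proxies; the map standing in for the ITEM table, the selectCount counter, and the IllegalStateException (in place of ObjectNotFoundException) are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class LoadVersusGetDemo {

    // Stand-in for the ITEM table: identifier -> description.
    static final Map<Long, String> ITEM_TABLE = new HashMap<>();
    // Counts simulated SELECT statements.
    static int selectCount = 0;

    static String select(Long id) {
        selectCount++;
        return ITEM_TABLE.get(id);
    }

    static class Item {
        private final Long id;
        private String description;
        private boolean initialized;

        Item(Long id) { this.id = id; }

        // The identifier is known without any database access.
        public Long getId() { return id; }

        // Any other property forces initialization on first access.
        public String getDescription() {
            if (!initialized) {
                String row = select(id);
                if (row == null) {
                    throw new IllegalStateException("No row with identifier " + id);
                }
                description = row;
                initialized = true;
            }
            return description;
        }
    }

    // Toy get(): selects immediately and returns null if there is no row.
    static Item get(Long id) {
        String row = select(id);
        if (row == null) return null;
        Item item = new Item(id);
        item.description = row;
        item.initialized = true;
        return item;
    }

    // Toy load(): returns a placeholder without touching the table.
    static Item load(Long id) {
        return new Item(id);
    }

    public static void main(String[] args) {
        ITEM_TABLE.put(1L, "Vintage camera");

        Item reference = load(1L);
        System.out.println(reference.getId() + ", selects so far: " + selectCount);

        System.out.println(get(99L)); // null

        try {
            load(99L).getDescription(); // failure surfaces only on access
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The placeholder returned by the toy load() is enough to supply a foreign key value, which is exactly the case described for aComment.setForAuction(item).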

Modifying a persistent object

Any persistent object returned by get(), load(), or a query is already associated with the current Session and persistence context. It can be modified, and its state is synchronized with the database (see listing 9.4).

Figure 9.6 shows this unit of work and the object transitions.

Listing 9.4 Modifying a persistent instance

Figure 9.6 Modifying a persistent instance

First, you retrieve the object from the database with the given identifier. You modify the object, and these modifications are propagated to the database during flush when tx.commit() is called. This mechanism is called automatic dirty checking: Hibernate tracks and saves the changes you make to an object in persistent state. As soon as you close the Session, the instance is considered detached.

Making a persistent object transient

You can easily make a persistent object transient, removing its persistent state from the database, with the delete() method (see listing 9.5).

Listing 9.5 Making a persistent object transient using delete()


Look at figure 9.7.

The item object is in removed state after you call delete(); you shouldn’t continue working with it, and, in most cases, you should make sure any reference to it in your application is removed. The SQL DELETE is executed only when the Session’s persistence context is synchronized with the database at the end of the unit of work. After the Session is closed, the item object is considered an ordinary transient instance. The transient instance is destroyed by the garbage collector if it’s no longer referenced by any other object. Both the in-memory object instance and the persistent database row will have been removed.


Figure 9.7 Making a persistent object transient

FAQ Do I have to load an object to delete it? Yes, an object has to be loaded into the persistence context; an instance has to be in persistent state to be removed (note that a proxy is good enough). The reason is simple: You may have Hibernate interceptors enabled, and the object must be passed through these interceptors to complete its lifecycle. If you delete rows in the database directly, the interceptor won’t run. Having said that, Hibernate (and Java Persistence) offer bulk operations that translate into direct SQL DELETE statements.

Hibernate can also roll back the identifier of any entity that has been deleted, if you enable the hibernate.use_identifier_rollback configuration option. In the previous example, Hibernate sets the database identifier property of the deleted item to null after deletion and flushing, if the option is enabled. It’s then a clean transient instance that you can reuse in a future unit of work.

Replicating objects

The operations on the Session we have shown you so far are all common; you need them in every Hibernate application. But Hibernate can help you with some special use cases—for example, when you need to retrieve objects from one database and store them in another. This is called replication of objects.

Replication takes detached objects loaded in one Session and makes them persistent in another Session. These Sessions are usually opened from two different SessionFactorys that have been configured with a mapping for the same persistent class. Here is an example:

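[Listing not available: loading an object in one Session and replicating it in a second Session with a ReplicationMode.]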

The ReplicationMode controls the details of the replication procedure:

■ ReplicationMode.IGNORE—Ignores the object when there is an existing database row with the same identifier in the target database.

■ ReplicationMode.OVERWRITE—Overwrites any existing database row with the same identifier in the target database.

■ ReplicationMode.EXCEPTION—Throws an exception if there is an existing database row with the same identifier in the target database.

■ ReplicationMode.LATEST_VERSION—Overwrites the row in the target database if its version is earlier than the version of the object, or ignores the object otherwise. Requires enabled Hibernate optimistic concurrency control.
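The four modes amount to a small decision table. The sketch below restates that table in plain Java; the Mode and Action enums and the decide() method are hypothetical names, not part of the Hibernate API, and the assumption that a missing target row always leads to an insert is ours.

```java
class ReplicationModeDemo {

    // Hypothetical mirror of the four replication modes described above.
    enum Mode { IGNORE, OVERWRITE, EXCEPTION, LATEST_VERSION }

    // What happens to the object in the target database.
    enum Action { INSERT, OVERWRITE_ROW, SKIP, FAIL }

    static Action decide(Mode mode, boolean rowExists, int objectVersion, int rowVersion) {
        // Assumption: without an existing row there is no conflict to resolve.
        if (!rowExists) {
            return Action.INSERT;
        }
        switch (mode) {
            case IGNORE:
                return Action.SKIP;
            case OVERWRITE:
                return Action.OVERWRITE_ROW;
            case EXCEPTION:
                return Action.FAIL;
            case LATEST_VERSION:
                // Overwrite only if the row is older than the replicated object.
                return rowVersion < objectVersion ? Action.OVERWRITE_ROW : Action.SKIP;
            default:
                throw new AssertionError("Unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        System.out.println(decide(Mode.LATEST_VERSION, true, 3, 2)); // OVERWRITE_ROW
        System.out.println(decide(Mode.LATEST_VERSION, true, 2, 3)); // SKIP
        System.out.println(decide(Mode.EXCEPTION, true, 1, 1));      // FAIL
    }
}
```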

You may need replication when you reconcile data entered into different databases, when you’re upgrading system configuration information during product upgrades (which often involves a migration to a new database instance), or when you need to roll back changes made during non-ACID transactions.

You now know the persistence lifecycle and the basic operations of the persistence manager. Using these together with the persistent class mappings we discussed in earlier topics, you may now create your own small Hibernate application. Map some simple entity classes and components, and then store and load objects in a stand-alone application. You don’t need a web container or application server: Write a main() method, and call the Session as we discussed in the previous section.

In the next sections, we cover the detached object state and the methods to reattach and merge detached objects between persistence contexts. This is the foundation knowledge you need to implement long units of work (conversations). We assume that you’re familiar with the scope of object identity as explained earlier in this topic.

Working with detached objects

Modifying the item after the Session is closed has no effect on its persistent representation in the database. As soon as the persistence context is closed, item becomes a detached instance.

If you want to save modifications you made to a detached object, you have to either reattach or merge it.

Reattaching a modified detached instance

A detached instance may be reattached to a new Session (and managed by this new persistence context) by calling update() on the detached object. In our experience, it may be easier for you to understand the following code if you rename the update() method in your mind to reattach() —however, there is a good reason it’s called updating.

The update() method forces an update to the persistent state of the object in the database, always scheduling an SQL UPDATE. See listing 9.6 for an example of detached object handling.

Listing 9.6 Updating a detached instance


It doesn’t matter if the item object is modified before or after it’s passed to update(). The important thing here is that the call to update() is reattaching the detached instance to the new Session (and persistence context). Hibernate always treats the object as dirty and schedules an SQL UPDATE, which will be executed during flush. You can see the same unit of work in figure 9.8.

You may be surprised and probably hoped that Hibernate could know that you modified the detached item’s description (or that Hibernate should know you did not modify anything). However, the new Session and its fresh persistence context don’t have this information. Neither does the detached object contain some internal list of all the modifications you’ve made. Hibernate has to assume that an UPDATE in the database is needed. One way to avoid this UPDATE statement is to configure the class mapping of Item with the select-before-update="true" attribute. Hibernate then determines whether the object is dirty by executing a SELECT statement and comparing the object’s current state to the current database state.

Figure 9.8 Reattaching a detached object

If you’re sure you haven’t modified the detached instance, you may prefer another method of reattachment that doesn’t always schedule an update of the database.

Reattaching an unmodified detached instance

A call to lock() associates the object with the Session and its persistence context without forcing an update, as shown in listing 9.7.

Listing 9.7 Reattaching a detached instance with lock()


In this case, it does matter whether changes are made before or after the object has been reattached. Changes made before the call to lock() aren’t propagated to the database; use lock() only if you’re sure the detached instance hasn’t been modified. This method only guarantees that the object’s state changes from detached to persistent and that Hibernate will manage the persistent object again. Of course, any modifications you make to the object once it’s in managed persistent state require updating of the database.

We discuss Hibernate lock modes in the next topic. By specifying LockMode.NONE here, you tell Hibernate not to perform a version check or obtain any database-level locks when reassociating the object with the Session. If you specified LockMode.READ or LockMode.UPGRADE, Hibernate would execute a SELECT statement in order to perform a version check (and to lock the row(s) in the database for updating).

Making a detached object transient

Finally, you can make a detached instance transient, deleting its persistent state from the database, as in listing 9.8.

Listing 9.8 Making a detached object transient using delete()


This means you don’t have to reattach (with update() or lock()) a detached instance to delete it from the database. In this case, the call to delete() does two things: It reattaches the object to the Session and then schedules the object for deletion, executed on tx.commit(). The state of the object after the delete() call is removed.

Reattachment of detached objects is only one possible way to transport data between several Sessions. You can use another option to synchronize modifications to a detached instance with the database, through merging of its state.

Merging the state of a detached object

Merging of a detached object is an alternative approach. It can be complementary to or can replace reattachment. Merging was first introduced in Hibernate to deal with a particular case where reattachment was no longer sufficient (the old name for the merge() method in Hibernate 2.x was saveOrUpdateCopy()). Look at the following code, which tries to reattach a detached object:

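[Listing not available: a detached item with identifier 1234 is modified; in a new Session, item2 is loaded with the same identifier before update(item) is called.]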

The starting point is a detached item object with the database identity 1234. After modifying it, you try to reattach it to a new Session. However, before reattachment, another instance that represents the same database row has already been loaded into the persistence context of that Session. Obviously, the reattachment through update() clashes with this already persistent instance, and a NonUniqueObjectException is thrown. The error message of the exception is “A persistent instance with the same database identifier is already associated with the Session!” Hibernate can’t decide which object represents the current state.

You can resolve this situation by reattaching the item first; then, because the object is in persistent state, the retrieval of item2 is unnecessary. This is straightforward in a simple piece of code such as the example, but it may be impossible to refactor in a more sophisticated application. After all, a client sent the detached object to the persistence layer to have it managed, and the client may not (and shouldn’t) be aware of the managed instances already in the persistence context.

You can let Hibernate merge item and item2 automatically:

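[Listing not available: the same unit of work, calling merge(item) instead of update(item) and receiving item3 as the result.]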

Look at this unit of work in figure 9.9.

Figure 9.9 Merging a detached instance into a persistent instance

The merge(item) call results in several actions. First, Hibernate checks whether a persistent instance in the persistence context has the same database identifier as the detached instance you’re merging. In this case, this is true: item and item2, which was loaded with get(), have the same primary key value.

If there is an equal persistent instance in the persistence context, Hibernate copies the state of the detached instance onto the persistent instance. In other words, the new description that has been set on the detached item is also set on the persistent item2.

If there is no equal persistent instance in the persistence context, Hibernate loads it from the database (effectively executing the same retrieval by identifier as you did with get()) and then merges the detached state with the retrieved object’s state. This is shown in figure 9.10.

Figure 9.10 Merging a detached instance into an implicitly loaded persistent instance

If there is no equal persistent instance in the persistence context, and a lookup in the database yields no result, a new persistent instance is created, and the state of the merged instance is copied onto the new instance. This new object is then scheduled for insertion into the database and returned by the merge() operation.

An insertion also occurs if the instance you passed into merge() was a transient instance, not a detached object.

The following questions are likely on your mind:

■ What exactly is copied from item to item2? Merging includes all value-typed properties and all additions and removals of elements to any collection.

■ What state is item in? Any detached object you merge with a persistent instance stays detached. It doesn’t change state; it’s unaffected by the merge operation. Therefore, item and the other two references aren’t the same in Hibernate’s identity scope. (The first two identity checks in the last example.) However, item2 and item3 are identical references to the same persistent in-memory instance.

■ Why is item3 returned from the merge() operation? The merge() operation always returns a handle to the persistent instance it has merged the state into. This is convenient for the client that called merge(), because it can now either continue working with the detached item object and merge it again when needed, or discard this reference and continue working with item3. The difference is significant: If, before the Session completes, subsequent modifications are made to item2 or item3 after merging, the client is completely unaware of these modifications. The client has a handle only to the detached item object, which is now getting stale. However, if the client decides to throw away item after merging and continue with the returned item3, it has a new handle on up-to-date state. Both item and item2 should be considered obsolete after merging.
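The identity relationships between item, item2, and item3 can be traced in a toy model. The sketch below is plain Java, not Hibernate: ToyPersistenceContext, its map-based “database,” and the IllegalStateException (standing in for NonUniqueObjectException) are illustrative assumptions, and scheduling of SQL statements isn’t modeled.

```java
import java.util.HashMap;
import java.util.Map;

class MergeDemo {

    static class Item {
        private final Long id;
        private String description;

        Item(Long id, String description) {
            this.id = id;
            this.description = description;
        }

        public Long getId() { return id; }
        public String getDescription() { return description; }
        public void setDescription(String description) { this.description = description; }
    }

    // Toy persistence context: at most one managed instance per identifier.
    static class ToyPersistenceContext {
        private final Map<Long, String> database; // identifier -> description
        private final Map<Long, Item> managed = new HashMap<>();

        ToyPersistenceContext(Map<Long, String> database) {
            this.database = database;
        }

        Item get(Long id) {
            Item item = managed.get(id);
            if (item == null && database.containsKey(id)) {
                item = new Item(id, database.get(id));
                managed.put(id, item);
            }
            return item;
        }

        // Reattachment clashes if another instance with the same identifier is managed.
        void update(Item detached) {
            Item existing = managed.get(detached.getId());
            if (existing != null && existing != detached) {
                throw new IllegalStateException(
                    "A different instance with identifier " + detached.getId() + " is already managed");
            }
            managed.put(detached.getId(), detached);
        }

        // Merging copies state onto the managed instance and returns that instance.
        Item merge(Item detached) {
            Item target = get(detached.getId());
            if (target == null) {
                target = new Item(detached.getId(), null); // new managed copy
                managed.put(target.getId(), target);
            }
            target.setDescription(detached.getDescription());
            return target;
        }
    }

    public static void main(String[] args) {
        Map<Long, String> database = new HashMap<>();
        database.put(1234L, "Old description");

        Item item = new Item(1234L, "New description"); // detached and modified
        ToyPersistenceContext context = new ToyPersistenceContext(database);
        Item item2 = context.get(1234L);
        Item item3 = context.merge(item);

        System.out.println(item == item2);           // false
        System.out.println(item == item3);           // false
        System.out.println(item2 == item3);          // true
        System.out.println(item2.getDescription());  // New description
    }
}
```

Calling context.update(item) in place of merge(item) in this sketch would throw, which mirrors the clash described for reattachment.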

Merging of state is slightly more complex than reattachment. We consider it an essential operation you’ll likely have to use at some point if you design your application logic around detached objects. You can use this strategy as an alternative for reattachment and merge every time instead of reattaching. You can also use it to make any transient instance persistent. As you’ll see later in this topic, this is the standardized model of Java Persistence; reattachment isn’t supported.

We haven’t paid much attention so far to the persistence context and how it manages persistent objects.

Managing the persistence context

The persistence context does many things for you: automatic dirty checking, guaranteed scope of object identity, and so on. It’s equally important that you know some of the details of its management, and that you sometimes influence what goes on behind the scenes.

Controlling the persistence context cache

The persistence context is a cache of persistent objects. Every object in persistent state is known to the persistence context, and a duplicate, a snapshot of each persistent instance, is held in the cache. This snapshot is used internally for dirty checking, to detect any modifications you made to your persistent objects.

Many Hibernate users who ignore this simple fact run into an OutOfMemoryError. This is typically the case when you load thousands of objects in a Session but never intend to modify them. Hibernate still has to create a snapshot of each object in the persistence context cache and keep a reference to the managed object, which can lead to memory exhaustion. (Obviously, you should execute a bulk data operation if you modify thousands of objects.)

The persistence context cache never shrinks automatically. To reduce or regain the memory consumed by the persistence context in a particular unit of work, you have to do the following:

■ Keep the size of your persistence context to the necessary minimum. Often, many persistent instances in your Session are there by accident, for example, because you needed only a few but queried for many. Make objects persistent only if you absolutely need them in this state; extremely large graphs can have a serious performance impact and require significant memory for state snapshots. Check that your queries return only objects you need. As you’ll see later in the topic, you can also execute a query in Hibernate that returns objects in read-only state, without creating a persistence context snapshot.

■ You can call session.evict(object) to detach a persistent instance manually from the persistence context cache. You can call session.clear() to detach all persistent instances from the persistence context. Detached objects aren’t checked for dirty state; they aren’t managed.

■ With session.setReadOnly(object, true), you can disable dirty checking for a particular instance. The persistence context will no longer maintain the snapshot if it’s read-only. With session.setReadOnly(object, false), you can re-enable dirty checking for an instance and force the recreation of a snapshot. Note that these operations don’t change the object’s state.
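The role of the snapshot, and the effect of evicting an instance, can be seen in a toy model. This is a plain Java sketch under simplifying assumptions (a single description property, an IdentityHashMap as the cache, SQL represented as strings); it isn’t how Hibernate stores snapshots internally.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class DirtyCheckingDemo {

    static class Item {
        private String description;
        Item(String description) { this.description = description; }
        public String getDescription() { return description; }
        public void setDescription(String description) { this.description = description; }
    }

    // Toy persistence context: managed instance -> snapshot of its state.
    static class ToyPersistenceContext {
        private final Map<Item, String> snapshots = new IdentityHashMap<>();

        void manage(Item item) { snapshots.put(item, item.getDescription()); }

        // An evicted instance is no longer checked for changes.
        void evict(Item item) { snapshots.remove(item); }

        int size() { return snapshots.size(); }

        // Compares every managed instance with its snapshot and
        // returns one UPDATE per instance that has changed.
        List<String> flush() {
            List<String> statements = new ArrayList<>();
            for (Map.Entry<Item, String> entry : snapshots.entrySet()) {
                String current = entry.getKey().getDescription();
                if (!Objects.equals(entry.getValue(), current)) {
                    statements.add("UPDATE ITEM SET DESCRIPTION = '" + current + "'");
                    entry.setValue(current); // refresh the snapshot
                }
            }
            return statements;
        }
    }

    public static void main(String[] args) {
        ToyPersistenceContext context = new ToyPersistenceContext();
        Item kept = new Item("original");
        Item evicted = new Item("original");
        context.manage(kept);
        context.manage(evicted);

        kept.setDescription("first edit");
        kept.setDescription("second edit"); // still only one UPDATE at flush
        context.evict(evicted);
        evicted.setDescription("ignored");  // not managed, so not detected

        System.out.println(context.flush()); // [UPDATE ITEM SET DESCRIPTION = 'second edit']
    }
}
```

The map grows with every managed instance and shrinks only through evict(), which is the memory behavior the preceding list warns about.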

At the end of a unit of work, all the modifications you made have to be synchronized with the database through SQL DML statements. This process is called flushing of the persistence context.

Flushing the persistence context

The Hibernate Session implements write-behind. Changes to persistent objects made in the scope of a persistence context aren’t immediately propagated to the database. This allows Hibernate to coalesce many changes into a minimal number of database requests, helping minimize the impact of network latency. Another excellent side-effect of executing DML as late as possible, toward the end of the transaction, is shorter lock durations inside the database.

For example, if a single property of an object is changed twice in the same persistence context, Hibernate needs to execute only one SQL UPDATE. Another example of the usefulness of write-behind is that Hibernate is able to take advantage of the JDBC batch API when executing multiple UPDATE, INSERT, or DELETE statements.
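The coalescing effect can be sketched with a small in-memory buffer. This is a toy model, assuming a single NAME column and non-null values; WriteBehindSketch and the SQL strings it records are hypothetical and only illustrate the idea, not what Hibernate generates.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy write-behind buffer; hypothetical names, not Hibernate internals.
class WriteBehindSketch {
    private final Map<Long, String> lastWritten = new LinkedHashMap<>(); // id -> name in the "database"
    private final Map<Long, String> inMemory = new LinkedHashMap<>();    // id -> current name (assumed non-null)
    private final List<String> executedSql = new ArrayList<>();

    void load(long id, String name) {
        lastWritten.put(id, name);
        inMemory.put(id, name);
    }

    // Changing a property only touches memory; no SQL is recorded here.
    void setName(long id, String name) { inMemory.put(id, name); }

    // Flush: one UPDATE per instance whose state differs from what was last written.
    void flush() {
        for (Map.Entry<Long, String> e : inMemory.entrySet()) {
            if (!e.getValue().equals(lastWritten.get(e.getKey()))) {
                executedSql.add("update ITEM set NAME = '" + e.getValue()
                        + "' where ITEM_ID = " + e.getKey());
                lastWritten.put(e.getKey(), e.getValue());
            }
        }
    }

    List<String> executedSql() { return executedSql; }

    // Two changes to the same property, then a single flush.
    static List<String> demo() {
        WriteBehindSketch session = new WriteBehindSketch();
        session.load(1L, "Old name");
        session.setName(1L, "First change");
        session.setName(1L, "Second change");
        session.flush();
        return session.executedSql();
    }

    public static void main(String[] args) {
        for (String sql : demo()) System.out.println(sql);
    }
}
```

Only the final state reaches the recorded SQL; the intermediate value never leaves memory.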

The synchronization of a persistence context with the database is called flushing. Hibernate flushes occur at the following times:

■ When a Transaction on the Hibernate API is committed

■ Before a query is executed

■ When the application calls session.flush() explicitly

Flushing the Session state to the database at the end of a unit of work is required in order to make the changes durable and is the common case. Note that automatic flushing when a transaction is committed is a feature of the Hibernate API! Committing a transaction with the JDBC API doesn’t trigger a flush. Also note that Hibernate doesn’t flush before every query: by default, it synchronizes first only if changes held in memory would affect the results of the query.

You can control this behavior by explicitly setting the Hibernate FlushMode via a call to session.setFlushMode(). The default flush mode is FlushMode.AUTO and enables the behavior described previously. If you choose FlushMode.COMMIT, the persistence context isn’t flushed before query execution (it’s flushed only when you call Transaction.commit() or Session.flush() manually). This setting may expose you to stale data: Modifications you make to managed objects only in memory may conflict with the results of the query. By selecting FlushMode.MANUAL, you may specify that only explicit calls to flush() result in synchronization of managed state with the database.
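The three flush modes can be summarized as a small decision table. The following sketch encodes only the rules stated in the text; the class, enum, and method names are hypothetical, and the boolean parameter stands in for Hibernate's own check of whether pending changes would affect a query.

```java
// Toy decision table; enum and method names are hypothetical, not Hibernate's.
class FlushDecisionSketch {
    enum Mode { AUTO, COMMIT, MANUAL }
    enum Event { BEFORE_QUERY, TRANSACTION_COMMIT, EXPLICIT_FLUSH }

    // Would the persistence context be synchronized for this event?
    static boolean flushes(Mode mode, Event event, boolean pendingChangesAffectQuery) {
        switch (event) {
            case EXPLICIT_FLUSH:
                return true;                          // flush() always synchronizes
            case TRANSACTION_COMMIT:
                return mode != Mode.MANUAL;           // AUTO and COMMIT flush at commit
            case BEFORE_QUERY:
                return mode == Mode.AUTO && pendingChangesAffectQuery;
            default:
                throw new IllegalArgumentException("unknown event " + event);
        }
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            for (Event event : Event.values()) {
                System.out.println(mode + " / " + event + " -> " + flushes(mode, event, true));
            }
        }
    }
}
```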

Controlling the FlushMode of a persistence context will be necessary later in the topic, when we extend the context to span a conversation.

Repeated flushing of the persistence context is often a source of performance issues, because all dirty objects in the persistence context have to be detected at flush-time. A common cause is a particular unit-of-work pattern that repeats a query-modify-query-modify sequence many times. Every modification leads to a flush and a dirty check of all persistent objects before the next query. FlushMode.COMMIT may be appropriate in this situation.

Always remember that the performance of the flush process depends in part on the size of the persistence context—the number of persistent objects it manages. Hence, the advice we gave for managing the persistence context, in the previous section, also applies here.
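A rough back-of-the-envelope model shows how the two factors multiply. The sketch assumes, for illustration only, that every flush inspects each managed instance exactly once, that the number of managed instances stays constant, and that every query in a query-modify loop is preceded by a flush when flushing before queries is enabled. FlushCostSketch is a hypothetical name.

```java
// Toy cost model; the assumptions are stated in the surrounding text.
class FlushCostSketch {
    // Total per-instance dirty checks in one unit of work.
    static long dirtyChecks(int managedInstances, int queryModifyRounds, boolean flushBeforeQueries) {
        long flushes = 1 + (flushBeforeQueries ? queryModifyRounds : 0); // 1 = flush at commit
        return flushes * managedInstances;
    }

    public static void main(String[] args) {
        System.out.println("flush before queries: " + dirtyChecks(1000, 50, true));
        System.out.println("flush at commit only: " + dirtyChecks(1000, 50, false));
    }
}
```

With 1,000 managed instances and 50 query-modify rounds, this model counts 51,000 checks when each query triggers a flush, against 1,000 when only the commit does.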

You’ve now seen the most important strategies and some optional ones for interacting with objects in a Hibernate application and what methods and operations are available on a Hibernate Session. If you plan to work only with Hibernate APIs, you can skip the next section and go directly to the next topic and read about transactions. If you want to work on your objects with Java Persistence and/or EJB 3.0 components, read on.

The Java Persistence API

We now store and load objects with the Java Persistence API. This is the API you use either in a Java SE application or with EJB 3.0 components, as a vendor-independent alternative to the Hibernate native interfaces.

You’ve read the first sections of this topic and know the object states defined by JPA and how they’re related to Hibernate’s. Because the two are similar, the first part of this topic applies no matter which API you choose. It follows that the way you interact with your objects, and how you manipulate the database, are also similar. So, we also assume that you have learned the Hibernate interfaces in the previous section (you also miss all the illustrations if you skip the previous section; we won’t repeat them here). This is important for another reason: JPA provides a subset of the functionality of the Hibernate native APIs. In other words, there are good reasons to fall back to native Hibernate interfaces whenever you need to. You can expect that the majority of the functionality you’ll need in an application is covered by the standard, so falling back is rarely necessary.

As in Hibernate, you store and load objects with JPA by manipulating the current state of an object. And, just as in Hibernate, you do this in a unit of work, a set of operations considered to be atomic. (We still haven’t covered enough ground to explain all about transactions, but we will soon.)

To begin a unit of work in a Java Persistence application, you need to get an EntityManager (the equivalent to the Hibernate Session). However, where you open a Session from a SessionFactory in a Hibernate application, a Java Persistence application can be written with managed and unmanaged units of work. Let’s keep this simple, and assume that you first want to write a JPA application that doesn’t benefit from EJB 3.0 components in a managed environment.

Storing and loading objects

The term unmanaged refers to the possibility to create a persistence layer with Java Persistence that runs and works without any special runtime environment. You can use JPA without an application server, outside of any runtime container, in a plain Java SE application. This can be a servlet application (the web container doesn’t provide anything you’d need for persistence) or a simple main() method. Another common case is local persistence for desktop applications, or persistence for two-tiered systems, where a desktop application accesses a remote database tier (although there is no good reason why you can’t use a lightweight modular application server with EJB 3.0 support in such a scenario).

Beginning a unit of work in Java SE

In any case, because you don’t have a container that could provide an EntityManager for you, you need to create one manually. The equivalent of the Hibernate SessionFactory is the JPA EntityManagerFactory:

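[Code listing omitted: an EntityManagerFactory is created, then an EntityManager is created from it and a transaction is started]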

The first line of code is part of your system configuration. You should create one EntityManagerFactory for each persistence unit you deploy in a Java Persistence application. This setup is covered in “Using Hibernate EntityManager,” so we won’t repeat it here. The next three lines are equivalent to how you’d begin a unit of work in a stand-alone Hibernate application: First, an EntityManager is created, and then a transaction is started. To familiarize yourself with EJB 3.0 jargon, you can call this EntityManager application-managed. The transaction you started here also has a special description: It’s a resource-local transaction. You’re controlling the resources involved (the database in this case) directly in your application code; no runtime container takes care of this for you.

The EntityManager has a fresh persistence context assigned when it’s created. In this context, you store and load objects.

Making an entity instance persistent

An entity class is the same as one of your Hibernate persistent classes. Of course, you’d usually prefer annotations to map your entity classes, as a replacement of Hibernate XML mapping files. After all, the (primary) reason you’re using Java Persistence is the benefit of standardized interfaces and mappings.

Let’s create a new instance of an entity and bring it from transient into persistent state:

[Code listing omitted: a new item instance is created and made persistent with persist() inside a unit of work]

This code should look familiar if you’ve followed the earlier sections of this topic. The transient item entity instance becomes persistent as soon as you call persist() on it; it’s now managed in the persistence context. Note that persist() doesn’t return the database identifier value of the entity instance (a small difference compared to Hibernate’s save() method).

FAQ Should I use persist() on the Session? The Hibernate Session interface also features a persist() method. It has the same semantics as the persist() operation of JPA. However, there’s an important difference between the two operations with regard to flushing. During synchronization, a Hibernate Session doesn’t cascade the persist() operation to associated entities and collections, even if you mapped an association with this option. It’s only cascaded to entities that are reachable when you call persist()! Only save() (and update()) are cascaded at flush-time if you use the Session API. In a JPA application, however, it’s the other way round: Only persist() is cascaded at flush-time.

Managed entity instances are monitored. Every modification you make to an instance in persistent state is at some point synchronized with the database (unless you abort the unit of work). Because the EntityTransaction is managed by the application, you need to do the commit() manually. The same rule applies to the application-controlled EntityManager: You need to release all resources by closing it.
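The shape of such an application-managed unit of work can be sketched without any persistence library. The following toy model is an illustration under assumptions, not JPA code: ToyUnitOfWork is a hypothetical class, the "database" is a plain map, and state is written only at commit. It shows why a missing commit() leaves the database untouched and why the unit of work is closed in a finally block.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy unit of work; hypothetical names, not the JPA EntityManager.
class ToyUnitOfWork {
    private final Map<Long, String> database;                         // shared "table": id -> description
    private final Map<Long, String> managed = new LinkedHashMap<>();  // this unit's persistence context
    private boolean open = true;

    ToyUnitOfWork(Map<Long, String> database) { this.database = database; }

    // The instance becomes managed; nothing is written yet.
    void persist(long id, String description) {
        requireOpen();
        managed.put(id, description);
    }

    // Synchronization happens here, when the application commits.
    void commit() {
        requireOpen();
        database.putAll(managed);
    }

    // Releases the context; using the unit of work afterward is an error.
    void close() {
        open = false;
        managed.clear();
    }

    private void requireOpen() {
        if (!open) throw new IllegalStateException("unit of work is closed");
    }

    // Runs one unit of work and returns the resulting "database".
    static Map<Long, String> demo(boolean commit) {
        Map<Long, String> db = new HashMap<>();
        ToyUnitOfWork unit = new ToyUnitOfWork(db);
        try {
            unit.persist(1L, "An item description");
            if (commit) unit.commit();
        } finally {
            unit.close();   // always release, committed or not
        }
        return db;
    }

    public static void main(String[] args) {
        System.out.println("with commit:    " + demo(true));
        System.out.println("without commit: " + demo(false));
    }
}
```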

Retrieving an entity instance

The EntityManager is also used to query the database and retrieve persistent entity instances. Java Persistence supports sophisticated query features (which we’ll cover later in the topic). The most basic is as always the retrieval by identifier:

[Code listing omitted: an Item is retrieved by its identifier with find() inside a unit of work]

You don’t need to cast the returned value of the find() operation; it’s a generic method, and its return type is set as a side effect of the first parameter. This is a minor but convenient benefit of the Java Persistence API; Hibernate native methods have to work with older JDKs that don’t support generics.

The retrieved entity instance is in a persistent state and can now be modified inside the unit of work or be detached for use outside of the persistence context. If no persistent instance with the given identifier can be found, find() returns null. The find() operation always hits the database (or a vendor-specific transparent cache), so the entity instance is always initialized during loading. You can expect to have all of its values available later in detached state.

If you don’t want to hit the database, because you aren’t sure you’ll need a fully initialized instance, you can tell the EntityManager to attempt the retrieval of a placeholder:

[Code listing omitted: an item placeholder is retrieved with getReference()]

This operation returns either the fully initialized item (for example, if the instance was already available in the current persistence context) or a proxy (a hollow placeholder).

As soon as you try to access any property of the item that isn’t the database identifier property, an additional SELECT is executed to fully initialize the placeholder. This also means you should expect an EntityNotFoundException at this point (or even earlier, when getReference() is executed). A logical conclusion is that if you decide to detach the item reference, no guarantees are made that it will be fully initialized (unless, of course, you access one of its nonidentifier properties before detachment).
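The difference between an immediate lookup and a placeholder can be sketched as follows. This is a toy model: ToyStore and ItemRef are hypothetical, a counter stands in for SQL SELECT statements, and an IllegalStateException stands in for the EntityNotFoundException mentioned above. It isn't how Hibernate builds proxies.

```java
import java.util.HashMap;
import java.util.Map;

// Toy store with a lazy placeholder; hypothetical names, not JPA or Hibernate API.
class ToyStore {
    private final Map<Long, String> rows = new HashMap<>(); // id -> description
    private int selects = 0;                                // simulated SELECT statements

    void insertRow(long id, String description) { rows.put(id, description); }
    int selectCount() { return selects; }

    // In the spirit of find(): reads the row right away; null if there is none.
    String findDescription(long id) {
        selects++;
        return rows.get(id);
    }

    // In the spirit of getReference(): hands out a placeholder without reading anything.
    ItemRef getReference(long id) { return new ItemRef(id); }

    class ItemRef {
        private final long id;
        private String description;
        private boolean initialized;

        ItemRef(long id) { this.id = id; }

        // Identifier access never triggers a load.
        long getId() { return id; }

        // First access to any other property loads the row.
        String getDescription() {
            if (!initialized) {
                selects++;
                if (!rows.containsKey(id)) {
                    // Stand-in for EntityNotFoundException.
                    throw new IllegalStateException("no row with id " + id);
                }
                description = rows.get(id);
                initialized = true;
            }
            return description;
        }
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.insertRow(1L, "A chair");
        ItemRef ref = store.getReference(1L);
        System.out.println("selects after getReference: " + store.selectCount());
        System.out.println(ref.getDescription());
        System.out.println("selects after property access: " + store.selectCount());
    }
}
```

A placeholder that is never touched before detachment stays uninitialized in this sketch, which mirrors the caveat about detached references.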

Modifying a persistent entity instance

An entity instance in persistent state is managed by the current persistence context. You can modify it and expect that the persistence context flushes the necessary SQL DML at synchronization time. This is the same automatic dirty checking feature provided by the Hibernate Session:

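[Code listing omitted: an item is retrieved by identifier, one of its mapped properties is modified, and the resource-local transaction is committed]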

A persistent entity instance is retrieved by its identifier value. Then, you modify one of its mapped properties (a property that hasn’t been annotated with @Transient or the transient Java keyword). In this code example, the next synchronization with the database occurs when the resource-local transaction is committed. The Java Persistence engine executes the necessary DML, in this case an UPDATE.

Making a persistent entity instance transient

If you want to remove the state of an entity instance from the database, you have to make it transient. Use the remove() method on your EntityManager:

[Code listing omitted: a persistent item is passed to remove() inside a unit of work]

The semantics of the remove() Java Persistence method are the same as the delete() method’s on the Hibernate Session. The previously persistent object is now in removed state, and you should discard any reference you’re holding to it in the application. An SQL DELETE is executed during the next synchronization of the persistence context. The JVM garbage collector detects that the item is no longer referenced by anyone and finally deletes the last trace of the object. However, note that you can’t call remove() on an entity instance in detached state, or an exception will be thrown. You have to merge the detached instance first and then remove the merged object (or, alternatively, get a reference with the same identifier, and remove that).
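The state transitions described so far can be written down as a small lookup function. The sketch below models only the transitions named in the text (persist on a transient instance, remove on a persistent or detached instance, and detachment by clearing or closing the context); every other combination is deliberately left unmodeled. LifecycleSketch and its enums are hypothetical names.

```java
// Toy lifecycle model; hypothetical names, covering only transitions named in the text.
class LifecycleSketch {
    enum State { TRANSIENT, PERSISTENT, REMOVED, DETACHED }
    enum Op { PERSIST, REMOVE, CLEAR_OR_CLOSE }

    static State apply(State state, Op op) {
        if (op == Op.PERSIST && state == State.TRANSIENT) return State.PERSISTENT;
        if (op == Op.REMOVE && state == State.PERSISTENT) return State.REMOVED;
        if (op == Op.REMOVE && state == State.DETACHED) {
            // Removing a detached instance is rejected; merge it first.
            throw new IllegalArgumentException("cannot remove a detached instance");
        }
        if (op == Op.CLEAR_OR_CLOSE && state == State.PERSISTENT) return State.DETACHED;
        throw new UnsupportedOperationException("transition not modeled: " + op + " on " + state);
    }

    public static void main(String[] args) {
        State s = State.TRANSIENT;
        s = apply(s, Op.PERSIST);
        System.out.println(s);
        s = apply(s, Op.REMOVE);
        System.out.println(s);
    }
}
```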

Flushing the persistence context

All modifications made to persistent entity instances are synchronized with the database at some point, a process called flushing. This write-behind behavior is the same as Hibernate’s and guarantees the best scalability by executing SQL DML as late as possible.

The persistence context of an EntityManager is flushed whenever commit() on an EntityTransaction is called. All the previous code examples in this section of the topic have been using that strategy. However, JPA implementations are allowed to synchronize the persistence context at other times, if they wish.

Hibernate, as a JPA implementation, synchronizes at the following times:

■ When an EntityTransaction is committed

■ Before a query is executed

■ When the application calls em.flush() explicitly

These are the same rules we explained for native Hibernate in the previous section. And as in native Hibernate, you can control this behavior with a JPA interface, the FlushModeType:

[Code listing omitted: the FlushModeType of an EntityManager is set to COMMIT]

Switching the FlushModeType to COMMIT for an EntityManager disables automatic synchronization before queries; it occurs only when the transaction is committed or when you flush manually. The default FlushModeType is AUTO.

Just as with native Hibernate, controlling the synchronization behavior of a persistence context will be important functionality for the implementation of conversations, which we’ll attack later.

You now know the basic operations of Java Persistence, and you can go ahead and store and load some entity instances in your own application. Set up your system as described in “Starting a Java Persistence project,” and map some classes to your database schema with annotations. Write a main() method that uses an EntityManager and an EntityTransaction; we think you’ll soon see how easy it is to use Java Persistence even without EJB 3.0 managed components or an application server.

Let’s discuss how you work with detached entity instances.

Working with detached entity instances

We assume you already know how a detached object is defined (if you don’t, read the first section of this topic again). You don’t necessarily have to know how you’d work with detached objects in Hibernate, but we’ll refer you to earlier sections if a strategy with Java Persistence is the same as in native Hibernate.

First, let’s see again how entity instances become detached in a Java Persistence application.

JPA persistence context scope

You’ve used Java Persistence in a Java SE environment, with application-managed persistence contexts and transactions. Every persistent and managed entity instance becomes detached when the persistence context is closed. But wait—we didn’t tell you when the persistence context is closed.

If you’re familiar with native Hibernate, you already know the answer: The persistence context ends when the Session is closed. The EntityManager is the equivalent in JPA; and by default, if you created the EntityManager yourself, the persistence context is scoped to the lifecycle of that EntityManager instance. Look at the following code:

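[Code listing omitted: one EntityManager spans two transactions; an Item is retrieved in the first and a User is loaded in the second, before the EntityManager is closed]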

In the first transaction, you retrieve an Item object. The transaction then completes, but the item is still in persistent state. Hence, in the second transaction, you not only load a User object, but also update the modified persistent item when the second transaction is committed (in addition to an update for the dirty user instance).

Just like in native Hibernate code with a Session, the persistence context begins with createEntityManager() and ends with close().

Closing the persistence context isn’t the only way to detach an entity instance.

Manual detachment of entity instances

An entity instance becomes detached when it leaves the persistence context. A method on the EntityManager allows you to clear the persistence context and detach all persistent instances:

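[Code listing omitted: an item is retrieved, the persistence context of the EntityManager is cleared, and the now detached item is modified before commit]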

After the item is retrieved, you clear the persistence context of the EntityManager. All entity instances that have been managed by that persistence context are now detached. The modification of the detached instance isn’t synchronized with the database during commit.

FAQ Where is eviction of individual instances? The Hibernate Session API features the evict(object) method. Java Persistence doesn’t have this capability. The reason is probably only known by some expert group members—we can’t explain it. (Note that this is a nice way of saying that experts couldn’t agree on the semantics of the operation.) You can only clear the persistence context completely and detach all persistent objects. You have to fall back to the Session API as described, if you want to evict individual instances from the persistence context.

Obviously you also want to save any modifications you made to a detached entity instance at some point.

Merging detached entity instances

Whereas Hibernate offers two strategies, reattachment and merging, to synchronize any changes of detached objects with the database, Java Persistence only offers the latter. Let’s assume you’ve retrieved an item entity instance in a previous persistence context, and now you want to modify it and save these modifications.

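[Code listing omitted: an item is retrieved in a first persistence context, its description is modified in detached state, and merge() in a second persistence context returns the persistent mergedItem]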

The item is retrieved in a first persistence context and merged, after modification in detached state, into a new persistence context. The merge() operation does several things:

First, the Java Persistence engine checks whether a persistent instance in the persistence context has the same database identifier as the detached instance you’re merging. Because, in our code examples, there is no equal persistent instance in the second persistence context, one is retrieved from the database through lookup by identifier. Then, the detached entity instance is copied onto the persistent instance. In other words, the new description that has been set on the detached item is also set on the persistent mergedItem, which is returned from the merge() operation.

If there is no equal persistent instance in the persistence context, and a lookup by identifier in the database is negative, the merged instance is copied onto a fresh persistent instance, which is then inserted into the database when the second persistence context is synchronized with the database.
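The merge steps just described can be sketched in a few lines. This is a toy model and not the JPA implementation: MergeSketch and its nested Item class are hypothetical, the "database" is a map from identifier to description, and flush simply writes every managed instance.

```java
import java.util.HashMap;
import java.util.Map;

// Toy merge; hypothetical names, not the JPA EntityManager.
class MergeSketch {
    static class Item {
        final long id;
        String description;
        Item(long id, String description) { this.id = id; this.description = description; }
    }

    private final Map<Long, String> database;                 // "table": id -> description
    private final Map<Long, Item> managed = new HashMap<>();  // persistence context

    MergeSketch(Map<Long, String> database) { this.database = database; }

    Item merge(Item detached) {
        // 1. Is an instance with the same identifier already managed?
        Item target = managed.get(detached.id);
        if (target == null) {
            // 2. Otherwise look the row up by identifier; if there is none,
            //    this is a fresh instance that will be inserted at flush.
            target = new Item(detached.id, database.get(detached.id));
            managed.put(detached.id, target);
        }
        // 3. Copy the detached state onto the managed instance.
        target.description = detached.description;
        // 4. Return the managed instance; the argument stays detached.
        return target;
    }

    // Synchronize: write every managed instance to the "database".
    void flush() {
        for (Item item : managed.values()) database.put(item.id, item.description);
    }

    public static void main(String[] args) {
        Map<Long, String> db = new HashMap<>();
        db.put(1L, "Old description");
        MergeSketch context = new MergeSketch(db);
        Item detached = new Item(1L, "New description");
        Item mergedItem = context.merge(detached);
        System.out.println("same instance? " + (mergedItem == detached));
        context.flush();
        System.out.println(db);
    }
}
```

The point to notice is that the caller must continue working with the returned instance; later changes to the detached argument aren't tracked.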

Merging of state is an alternative to reattachment (as provided by native Hibernate). Refer to our earlier discussion of merging with Hibernate in section 9.3.2, “Merging the state of a detached object”; both APIs offer the same semantics, and the notes there apply for JPA mutatis mutandis.

You’re now ready to expand your Java SE application and experiment with the persistence context and detached objects in Java Persistence. Instead of only storing and loading entity instances in a single unit of work, try to use several and try to merge modifications of detached objects. Don’t forget to watch your SQL log to see what’s going on behind the scenes.

Once you’ve mastered basic Java Persistence operations with Java SE, you’ll probably want to do the same in a managed environment. The benefits you get from JPA in a full EJB 3.0 container are substantial. No longer do you have to manage EntityManager and EntityTransaction yourself. You can focus on what you’re supposed to do: load and store objects.

Using Java Persistence in EJB components

A managed runtime environment implies some sort of container. Your application components live inside this container. Most containers these days are implemented using an interception technique: method calls on objects are intercepted, and any code that needs to be executed before (or after) the method is applied. This is perfect for any cross-cutting concerns: Opening and closing an EntityManager, because you need it inside the method that is called, is certainly one. Your business logic doesn’t need to be concerned with this aspect. Transaction demarcation is another concern a container can take care of for you. (You’ll likely find other aspects in any application.)

Unlike older application servers from the EJB 2.x era, containers that support EJB 3.0 and other Java EE 5.0 services are easy to install and use. Refer to “Introducing EJB components” to prepare your system for the following section. Furthermore, the EJB 3.0 programming model is based on plain Java classes. You shouldn’t be surprised if you see us writing many EJBs in this topic; most of the time, the only difference from a plain JavaBean is a simple annotation, a declaration that you wish to use a service provided by the environment the component will run in. If you can’t modify the source code and add an annotation, you can turn a class into an EJB with an XML deployment descriptor. Hence, (almost) every class can be a managed component in EJB 3.0, which makes it much easier for you to benefit from Java EE 5.0 services.

The entity classes you’ve created so far aren’t enough to write an application. You also want stateless or stateful session beans, components that you can use to encapsulate your application logic. Inside these components, you need the services of the container: for example, you usually want the container to inject an EntityManager, so that you can load and store entity instances.

Injecting an EntityManager

Remember how you create an instance of an EntityManager in Java SE? You have to open it from an EntityManagerFactory and close it manually. You also have to begin and end a resource-local transaction with the EntityTransaction interface.

In an EJB 3.0 server, a container-managed EntityManager is available through dependency injection. Consider the following EJB session bean that implements a particular action in the CaveatEmptor application:

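[Code listing omitted: a stateless session bean that implements the ManageAuction interface, with an EntityManager field em annotated with @PersistenceContext and a findAuctionByName() method that requires a transaction]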

It’s a stateless action, and it implements the ManageAuction interface. These details of stateless EJBs aren’t our concern at this time; what is interesting is that you can access the EntityManager in the findAuctionByName() method of the action. The container automatically injects an instance of an EntityManager into the em field of the bean, before the action method executes. The visibility of the field isn’t important for the container, but you need to apply the @PersistenceContext annotation to indicate that you want the container’s service. You could also create a public setter method for this field and apply the annotation on this method. This is the recommended approach if you also plan to set the EntityManager manually, for example during integration or functional testing.

The injected EntityManager is maintained by the container. You don’t have to flush or close it, nor do you have to start and end a transaction—in the previous example you tell the container that the findAuctionByName() method of the session bean requires a transaction. (This is the default for all EJB session bean methods.) A transaction must be active when the method is called by a client (or a new transaction is started automatically). When the method returns, the transaction either continues or is committed, depending on whether it was started for this method.

The persistence context of the injected container-managed EntityManager is bound to the scope of the transaction. Hence, it’s flushed automatically and closed when the transaction ends. This is an important difference, if you compare it with earlier examples that showed JPA in Java SE! The persistence context there wasn’t scoped to the transaction but to the EntityManager instance you closed explicitly. The transaction-scoped persistence context is the natural default for a stateless bean, as you’ll see when you focus on conversation implementation and transactions later, in the following topics.
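The practical consequence of the two scopes can be sketched with a toy model. ScopeSketch and its Scope enum are hypothetical names; commit() stands in for flushing at transaction end, and the list of writes stands in for SQL UPDATE statements. The only difference between the two scopes in this sketch is whether the context is discarded when the transaction ends.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy comparison of persistence context scopes; hypothetical names, not JPA API.
class ScopeSketch {
    enum Scope { ENTITY_MANAGER, TRANSACTION }

    static class Item {
        String description;
        Item(String description) { this.description = description; }
    }

    private final Scope scope;
    private final Map<Item, String> snapshots = new IdentityHashMap<>(); // managed -> snapshot
    final List<String> writes = new ArrayList<>();                       // simulated UPDATEs

    ScopeSketch(Scope scope) { this.scope = scope; }

    Item load(String description) {
        Item item = new Item(description);
        snapshots.put(item, description);
        return item;
    }

    void commit() {
        // Flush: write every managed instance that differs from its snapshot.
        for (Map.Entry<Item, String> e : snapshots.entrySet()) {
            String now = e.getKey().description;
            if (!now.equals(e.getValue())) {
                writes.add(now);
                e.setValue(now);
            }
        }
        // A transaction-scoped context ends here; all instances become detached.
        if (scope == Scope.TRANSACTION) snapshots.clear();
    }

    // Load in a first transaction, modify afterward, then commit a second transaction.
    static int writesAfterSecondCommit(Scope scope) {
        ScopeSketch em = new ScopeSketch(scope);
        Item item = em.load("original");
        em.commit();                    // first transaction: nothing changed yet
        item.description = "changed";   // modified between the two transactions
        em.commit();                    // second transaction
        return em.writes.size();
    }

    public static void main(String[] args) {
        for (Scope scope : Scope.values()) {
            System.out.println(scope + ": " + writesAfterSecondCommit(scope) + " write(s)");
        }
    }
}
```

With the context scoped to the EntityManager, the late modification is still written by the second commit; with a transaction-scoped context, the instance is already detached and the change is lost unless you merge it.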

A nice trick that obviously works only with JBoss EJB 3.0 is the automatic injection of a Session object, instead of an EntityManager:

[Code listing omitted: a Hibernate Session is injected into a bean instead of an EntityManager]

This is mostly useful if you have a managed component that would rely on the Hibernate API.

Here is a variation that works with two databases—that is, two persistence units:

[Code listing omitted: injection for two persistence units, each selected by its unitName]

The unitName refers to the configured and deployed persistence unit. If you work with one database (one EntityManagerFactory or one SessionFactory), you don’t need to declare the name of the persistence unit for injection. Note that EntityManager instances from two different persistence units aren’t sharing the same persistence context. Naturally, both are independent caches of managed entity objects, but that doesn’t mean they can’t participate in the same system transaction.

If you write EJBs with Java Persistence, the choice is clear: You want the EntityManager with the right persistence context injected into your managed components by the container. An alternative you’ll rarely use is the lookup of a container-managed EntityManager.

Looking up an EntityManager

Instead of letting the container inject an EntityManager on your field or setter method, you can look it up from JNDI when you need it:

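[Code listing omitted: an EntityManager reference is declared under the name em/auction and looked up through an injected SessionContext]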

Several things are happening in this code snippet: First, you declare that you want the component environment of the bean populated with an EntityManager and that the name of the bound reference is supposed to be em/auction. The full name in JNDI is java:comp/env/em/auction; the java:comp/env/ part is the so-called bean-naming context. Everything in that subcontext of JNDI is bean-dependent. In other words, the EJB container reads this annotation and knows that it has to bind an EntityManager for this bean only, at runtime when the bean executes, under the namespace in JNDI that is reserved for this bean.

You look up the EntityManager in your bean implementation with the help of the SessionContext. The benefit of this context is that it automatically prefixes the name you’re looking for with java:comp/env/; hence, it tries to find the reference in the bean’s naming context, and not the global JNDI namespace. The @Resource annotation instructs the EJB container to inject the SessionContext for you.

A persistence context is created by the container when the first method on the EntityManager is called, and it’s flushed and closed when the transaction ends, that is, when the method returns.

Injection and lookup are also available if you need an EntityManagerFactory.

Accessing an EntityManagerFactory

An EJB container also allows you to access an EntityManagerFactory for a persistence unit directly. Without a managed environment, you have to create the EntityManagerFactory with the help of the Persistence bootstrap class. In a container, you can instead have the container inject an EntityManagerFactory:

[Code listing omitted: an EntityManagerFactory is injected into a bean, optionally selected by unitName, and an EntityManager is created from it]

The unitName attribute is optional and only required if you have more than one configured persistence unit (several databases). The EntityManager you created from the injected factory is again application-managed: the container won’t flush this persistence context, nor close it. It’s rare that you mix container-managed factories with application-managed EntityManager instances, but doing so is useful if you need more control over the lifecycle of an EntityManager in an EJB component.

You may create an EntityManager outside of any JTA transaction boundaries; for example, in an EJB method that doesn’t require a transaction context. It’s then your responsibility to notify the EntityManager that a JTA transaction is active, when needed, with the joinTransaction() method. Note that this operation doesn’t bind or scope the persistence context to the JTA transaction; it’s only a hint that switches the EntityManager to transactional behavior internally.

The previous statements aren’t complete: If you close() the EntityManager, it doesn’t immediately close its persistence context, if this persistence context has been associated with a transaction. The persistence context is closed when the transaction completes. However, any call of the closed EntityManager throws an exception (except for the getTransaction() method in Java SE and the isOpen() method). You can switch this behavior with the hibernate.ejb.discard_pc_on_close configuration setting. You don’t have to worry about this if you never call the EntityManager outside of transaction boundaries.

Another reason for accessing your EntityManagerFactory may be that you want to access a particular vendor extension on this interface.

You can also look up an EntityManagerFactory if you bind it to the EJB’s naming context first:

[Code listing omitted: an EntityManagerFactory is bound to the EJB’s naming context and then looked up]

Again, there is no particular advantage if you compare the lookup technique with automatic injection.

Summary

We’ve covered a lot of ground in this topic. You now know that the basic interfaces in Java Persistence aren’t much different from those provided by Hibernate. Loading and storing objects is almost the same. The scope of the persistence context is slightly different, though; in Hibernate, it’s by default the same as the Session scope. In Java Persistence, the scope of the persistence context varies, depending on whether you create an EntityManager yourself, or you let the container manage and bind it to the current transaction scope in an EJB component.

Table 9.1 shows a summary you can use to compare native Hibernate features and Java Persistence.

We’ve already talked about conversations in an application and how you can design them with detached objects or with an extended persistence context. Although we haven’t had time to discuss every detail, you can probably already see that working with detached objects requires discipline (outside of the guaranteed scope of object identity) and manual reattachment or merging. In practice, and from our experience over the years, we recommend that you consider detached objects a secondary option and that you first look at an implementation of conversations with an extended persistence context.

Table 9.1 Hibernate and JPA comparison chart

Hibernate Core | Java Persistence and EJB 3.0

Hibernate defines and relies on four object states: transient, persistent, removed, and detached. | Equivalent object states are standardized and defined in EJB 3.0.

Detached objects can be reattached to a new persistence context or merged onto persistent instances. | Only merging is supported with the Java Persistence management interfaces.

At flush-time, the save() and update() operations can be cascaded to all associated and reachable instances. The persist() operation can only be cascaded to reachable instances at call-time. | At flush-time, the persist() operation can be cascaded to all associated and reachable instances. If you fall back to the Session API, save() and update() are only cascaded to reachable instances at call-time.

A get() hits the database; a load() may return a proxy. | A find() hits the database; a getReference() may return a proxy.

Dependency injection of a Session in an EJB works only in JBoss Application Server. | Dependency injection of an EntityManager works in all EJB 3.0 components.

Unfortunately, we still don’t have all the pieces to write a really sophisticated application with conversations. You may especially miss more information about transactions. The next topic covers transaction concepts and interfaces.
