Querying with HQL and JPA QL (Hibernate)

 

Queries are the most interesting part of writing good data access code. A complex query may require a long time to get right, and its impact on the performance of the application can be tremendous. On the other hand, writing queries becomes much easier with more experience, and what seemed difficult at first is only a matter of knowing some of the more advanced features.

If you’ve been using handwritten SQL for a number of years, you may be concerned that ORM will take away some of the expressiveness and flexibility that you’re used to. This isn’t the case with Hibernate and Java Persistence.

Hibernate’s powerful query facilities allow you to express almost everything you commonly (or even uncommonly) need to express in SQL, but in object-oriented terms—using classes and properties of classes.

We’ll show you the differences between native Hibernate queries and the standardized subset in Java Persistence. You may also use this topic as a reference; hence some sections are written in a less verbose style but show many small code examples for different use cases. We also sometimes skip optimizations in the CaveatEmptor application for better readability. For example, instead of referring to the MonetaryAmount value type, we use a BigDecimal amount in comparisons.

First, we show you how queries are executed. Don’t let yourself be distracted by the queries; we discuss them soon.

Creating and running queries

■ Hibernate Query Language (HQL), and the subset standardized as JPA QL:

tmp1DC-72_thumb

■ Direct SQL with or without automatic mapping of resultsets to objects:

tmp1DC-73_thumb

A query must be prepared in application code before execution. So, querying involves several distinct steps:

Let’s start with a few examples so you understand the basic usage. In earlier topics, we mentioned that there are three ways to express queries in Hibernate:

1 Create the query, with any arbitrary restriction or projection of data that you want to retrieve.

2 Bind runtime arguments to query parameters; the query can be reused with changing settings.

3 Execute the prepared query against the database and retrieval of data. You can control how the query is executed and how data should be retrieved into memory (all at once or piecemeal, for example).

Hibernate and Java Persistence offer query interfaces and methods on these interfaces to prepare and execute arbitrary data retrieval operations.

Preparing a query

The org.hibernate.Query and org.hibernate.Criteria interfaces both define several methods for controlling execution of a query. In addition, Query provides methods for binding concrete values to query parameters. To execute a query in your application, you need to obtain an instance of one of these interfaces, using the Session.

Java Persistence specifies the javax.persistence.Query interface. The standardized interface isn’t as rich as the native Hibernate API, but offers all necessary methods to execute a query in different ways and to bind arguments to query parameters. Unfortunately, the useful Hibernate Criteria API has no equivalent in Java Persistence, although it’s highly likely that a similar query interface will be added in a future version of the standard.

Creating a query object

To create a new Hibernate Query instance, call either createQuery() or create-SQLQuery() on a Session. The createQuery() method prepares an HQL query:

tmp1DC74_thumb

createSQLQuery() is used to create an SQL query using the native syntax of the underlying database:

tmp1DC75_thumb

In both cases, Hibernate returns a newly instantiated Query object that may be used to specify exactly how a particular query should be executed and to allow execution of the query. So far, no SQL has been sent to the database.

To obtain a Criteria instance, call createCriteria(), passing the class of the objects you want the query to return. This is also called the root entity of the criteria query, the User in this example:

Criteria crit = session.createCriteria(User.class);

The Criteria instance may be used in the same way as a Query object—but it’s also used to construct the object-oriented representation of the query, by adding Criterion instances and navigating associations to new Criterias.

With the Java Persistence API, your starting point for queries is the EntityMan-ager. To create a javax.persistence.Query instance for JPA QL, call create-Query():

tmp1DC76_thumb

To create a native SQL query, use createNativeQuery() :

tmp1DC-77_thumb

The way you define the returned objects from a native query is slightly different than in Hibernate (there are no placeholders in the query here).

After you’ve created the query, you prepare it for execution by setting various options.

Paging the result

A commonly used technique is pagination. Users may see the result of their search request (for example, for specific Items) as a page. This page shows a limited subset (say, 10 Items) at a time, and users can navigate to the next and previous pages manually. In Hibernate, Query and Criteria interfaces support this pagination of the query result:

tmp1DC78_thumb

The call to setMaxResults(10) limits the query resultset to the first 10 objects (rows) returned by the database. In this Criteria query, the requested page starts in the middle of the resultset:

tmp1DC79_thumb

Starting from the fortieth object, you retrieve the next 20 objects. Note that there is no standard way to express pagination in SQL—Hibernate knows the tricks to make this work efficiently on your particular database. You can even add this flexible pagination option to an SQL query. Hibernate will rewrite your SQL for pagination:

tmp1DC-80_thumb

You may use the method-chaining coding style (methods return the receiving object instead of void) with the Query and Criteria interfaces, rewriting the two previous examples as follows:

tmp1DC-81_thumb

Chaining method calls is less verbose and is supported by many Hibernate APIs. The Java Persistence query interfaces also support pagination and method chaining for JPA QL and native SQL queries with the javax.persistence.Query interface:

tmp1DC82_thumb

Next in preparing your query is the setting of any runtime parameters. Considering parameter binding

Without runtime parameter binding, you have to write bad code:

tmp1DC83_thumb

You should never write this code because a malicious user could search for the following item description—that is, by entering the value of search in a search dialog box as

tmp1DC84_thumb

As you can see, the original queryString is no longer a simple search for a string but also executes a stored procedure in the database! The quote characters aren’t escaped; hence the call to the stored procedure is another valid expression in the query. If you write a query like this, you open up a major security hole in your application by allowing the execution of arbitrary code on your database. This is known as an SQL injection security issue. Never pass unchecked values from user input to the database! Fortunately, a simple mechanism prevents this mistake.

The JDBC driver includes functionality for safely binding values to SQL parameters. It knows exactly what characters in the parameter value to escape, so that the previous vulnerability doesn’t exist. For example, the quote characters in the given search are escaped and are no longer treated as control characters but as a part of the search string value. Furthermore, when you use parameters, the database is able to efficiently cache precompiled prepared statements, improving performance significantly.

There are two approaches to parameter binding: using positional or using named parameters. Hibernate and Java Persistence support both options, but you can’t use both at the same time for a particular query.

With named parameters, you can rewrite the query as

tmp1DC85_thumb

The colon followed by a parameter name indicates a named parameter. Then, bind a value to the search parameter:

tmp1DC86_thumb

Because searchString is a user-supplied string variable, you call the setString() method of the Query interface to bind it to the named parameter (:search). This code is cleaner, much safer, and performs better, because a single compiled SQL statement can be reused if only bind parameters change.

Often, you’ll need multiple parameters:

tmp1DC87_thumb

The same query and code looks slightly different in Java Persistence:

tmp1DC88_thumb

The setParameter() method is a generic operation that can bind all types of arguments, it only needs a little help for temporal types (the engine needs to know if you want only the date, time, or a full timestamp bound). Java Persistence supports only this method for binding of parameters (Hibernate, by the way, has it too).

Hibernate, on the other hand, offers many other methods, some of them for completeness, others for convenience, that you can use to bind arguments to query parameters.

Using Hibernate parameter binding

You’ve called setString() and setDate() to bind arguments to query parameters. The native Hibernate Query interface provides similar convenience methods for binding arguments of most of the Hibernate built-in types: everything from setInteger() to setTimestamp() and setLocale(). They’re mostly optional; you can rely on the setParameter() method to figure out the right type automatically (except for temporal types).

A particularly useful method is setEntity(), which lets you bind a persistent entity (note that setParameter() is smart enough to understand even that automatically):

tmp1DC89_thumb

However, there is also a generic method that allows you to bind an argument of any Hibernate type:

tmp1DC90_thumb

If you have a JavaBean with seller and description properties, you can call the setProperties() method to bind the query parameters. For example, you can pass query parameters in an instance of the Item class itself:

tmp1DC91_thumb

The setProperties() binding matches names of JavaBean properties to named parameters in the query string, internally calling setParameter() to guess the Hibernate type and bind the value. In practice, this turns out to be less useful than it sounds, because some common Hibernate types aren’t guessable (temporal types, in particular).

The parameter binding methods of Query are null-safe. So the following code is legal:

tmp1DC92_thumb

Java Persistence also supports positional parameters:

tmp1DC93_thumb

tmp1DC94_thumb

However, the result of this code is almost certainly not what you intended! The resulting SQL will contain a comparison like USERNAME = null, which always evaluates to null in SQL ternary logic. Instead, you must use the is null operator:

tmp1DC95_thumb

Using positional parameters

If you prefer, you can use positional parameters instead in Hibernate and Java Persistence:

Every change of the position of the bind parameters requires a change to the parameter binding code. This leads to fragile and maintenance-intensive code. Our recommendation is to avoid positional parameters. They may be more convenient if you build complex queries programmatically, but the Criteria API is a much better alternative for that purpose.

If you have to use positional parameters, remember that Hibernate starts counting at 0, but Java Persistence starts at 1, and that you have to add a number to each question mark in a JPA QL query string. They have different legacy roots: Hibernate in JDBC, Java Persistence in older versions of EJB QL.

In addition to bind parameters, you often want to apply other hints that influence how a query is executed.

Setting query hints

Let’s assume that you make modifications to persistent objects before executing a query. These modifications are only present in memory, so Hibernate (and Java Persistence providers) flushes the persistence context and all changes to the database before executing your query. This guarantees that the query runs on current data and that no conflict between the query result and the in-memory objects can occur.

This is sometimes impractical: for example, if you execute a sequence that consists of many query-modify-query-modify operations, and each query is retrieving a different dataset than the one before. In other words, you don’t need to flush your modifications to the database before executing a query, because conflicting results aren’t a problem. Note that the persistence context provides repeatable read for entity objects, so only scalar results of a query are a problem anyway.

You can disable flushing of the persistence context with setFlushMode() on a Session or EntityManager. Or, if you want to disable flushing only before a particular query, you can set a FlushMode on the Query (Hibernate and JPA) object:

tmp1DC96_thumb

Not only is this code much less self-documenting than the alternative with named parameters, it’s also much more vulnerable to easy breakage if you change the query string slightly:

tmp1DC97_thumb

tmp1DC98_thumbtmp1DC99_thumb

Hibernate won’t flush the persistence context before executing any of these queries.

Another optimization is a fine-grained org.hibernate.CacheMode for a particular query result.  If Hibernate retrieves an object by identifier, it looks it up in the first-level persistence context cache and, if enabled, the second-level cache region for this entity. The same happens when you execute a query that returns entity instances: During marshaling of the query result, Hibernate tries to resolve all entity instances by looking them up from the persistence context cache first—it ignores the entity data of the query result if the entity instance is in the persistence context cache. And, if the retrieved entity instance wasn’t in any cache, Hibernate puts it there after the query completes. You can control this behavior with a CacheMode on a query:

tmp1DC100_thumb

A CacheMode.IGNORE, for example, tells Hibernate not to interact with the second-level cache for any entity returned by this query. In other words, any Item retrieved by this query isn’t put in the second-level cache. Setting this cache mode is useful if you execute a query that shouldn’t update the second-level cache, maybe because the data you’re retrieving is only relevant for a particular situation, and shouldn’t exhaust the available space in the cache region.

We talked about the control of the persistence context and how you can reduce memory consumption and prevent long dirty checking cycles. One way to disable dirty checking for a particular persistent object is to set session.setRead-Only(object, true) (the EntityManager doesn’t support this API).

You can tell Hibernate that all entity objects returned by a query should be considered read-only (although not detached):

tmp1DC101_thumb

All Item objects returned by this query are in persistent state, but no snapshot for automatic dirty checking is enabled in the persistence context. Hibernate doesn’t persist any modifications automatically, unless you disable read-only mode with session.setReadOnly(object, false).

You can control how long a query is allowed to run by setting a timeout:

tmp1DC102_thumb

This method has the same semantics and consequences as the setQueryTime-out() method on a JDBC Statement. Also related to the underlying JDBC is the fetch size:

tmp1DC103_thumb

The JDBC fetch size is an optimization hint for the database driver; it may not result in any performance improvement if the driver doesn’t implement this functionality. If it does, it can improve the communication between the JDBC client and the database, by retrieving many rows in one batch when the client operates on a query result (that is, on a ResultSet). Because Hibernate is working with the ResultSet behind the scenes, this hint can improve data retrieval if you execute a query with list() —which you’ll do soon.

When you optimize an application you often have to read complex SQL logs. We highly recommend that you enable hibernate.use_sql_comments; Hibernate will then add a comment to each SQL statement it writes to the logs. You can set a custom comment for a particular query with setComment():

tmp1DC104_thumb

The hints you’ve been setting so far are all related to Hibernate or JDBC handling. Many developers (and DBAs) consider a query hint to be something completely different. In SQL, a query hint is a comment in the SQL statement that contains an instruction for the SQL optimizer of the database management system. For example, if the developer or DBA thinks that the execution plan selected by the database optimizer for a particular SQL statement isn’t the fastest, they use a hint to force a different execution plan. Hibernate and Java Persistence don’t support arbitrary SQL hints with an API; you’ll have to fall back to native SQL and write your own SQL statement—you can of course execute that statement with the provided APIs.

(With some database-management systems you can control the optimizer with an SQL comment at the beginning of an SQL statement; in that case, use Query.setComment() to add the hint. In other scenarios, you may be able to write an org.hibernate.Interceptor and manipulate an SQL statement in the onPrepareStatement(sql) method before it’s sent to the database.)

Finally, you can control whether a query should force a pessimistic lock in the database management system—a lock that is held until the end of the database transaction:

tmp1DC105_thumb

Both queries, if supported by your database dialect, result in an SQL statement that includes a … FOR UPDATE operation (or the equivalent, if supported by the database system and dialect). Currently, pessimistic locking isn’t available (but it’s planned as a Hibernate extension hint) on the Java Persistence query interface. Let’s assume that queries are now prepared, so you can run them.

Executing a query

Once you’ve created and prepared a Query or Criteria object, you’re ready to execute it and retrieve the result into memory. Retrieving the whole result into memory in one turn is the most common way to execute a query; we call this listing. Some other options are available that we also discuss next:, iterating and scrolling. Scrolling is about as useful as iteration: You rarely need one of these options. We’d guess that more than 90 percent of all query execution relies on the list() and getResultList() methods in a regular application. First, the most common case.

Listing all results

In Hibernate, the list() method executes the query and returns the results as a java.util.List:

tmp1DC106_thumb

The Criteria interface also supports this operation:

tmp1DC107_thumb

In both cases, one or several SELECTs statements are executing immediately, depending on your fetch plan. If you map any associations or collections as non-lazy, they must be fetched in addition to the data you want retrieved with your query. All these objects are loaded into memory, and any entity objects that are retrieved are in persistent state and added to the persistence context.

Java Persistence offers a method with the same semantics, but a different name:

tmp1DC108_thumb

With some queries you know that the result is only a single instance—for example, if you want only the highest bid. In this case, you can read it from the result list by index, result.get(0). Or, you can limit the number of returned rows with setMaxResult(1). Then you may execute the query with the uniqueResult() method, because you know only one object will be returned:

tmp1DC109_thumb

If the query returns more than one object, an exception is thrown. If the query result is empty, a null is returned. This also works in Java Persistence, again with a different method name (and, unfortunately, an exception is thrown if the result is empty):

tmp1DC110_thumb

Retrieving all results into memory is the most common way to execute a query. Hibernate supports some other methods that you may find interesting if you want to optimize the memory consumption and execution behavior of a query.

Iterating through the results

The Hibernate Query interface also provides the iterate() method to execute a query. It returns the same data as list(), but relies on a different strategy for retrieving the results.

When you call iterate() to execute a query, Hibernate retrieves only the primary key (identifier) values of entity objects in a first SQL SELECT, and then tries to find the rest of the state of the objects in the persistence context cache, and (if enabled) the second-level cache. Consider the following code:

tmp1DC-111_thumb

This query results in execution of at least one SQL SELECT, with all columns of the CATEGORY table included in the SELECT clause:

tmp1DC112_thumb

tmp1DC113_thumb

You can use the iterate() method for this:

tmp1DC114_thumb

If you expect that categories are already cached in the persistence context or the second-level cache, then you need only the identifier value (the key to the cache). This therefore reduces the amount of data fetched from the database. The following SQL is slightly more efficient:

The initial query retrieves only Category primary key values. You then iterate through the result; Hibernate looks up each Category object in the current persistence context and, if enabled, in the second-level cache. If a cache miss occurs,

Hibernate executes an additional SELECT for each turn, retrieving the full Category object by its primary key from the database.

In most cases, this is a minor optimization. It’s usually much more important to minimize row reads than to minimize column reads. Still, if your object has large string fields, this technique may be useful to minimize data packets on the network and therefore latency. It should be clear that it’s really effective only if the second-level cache region for the iterated entity is enabled. Otherwise it produces n+1 selects!

Hibernate keeps the iterator open until you finish iteration through all results or until the Session is closed. You can also close it explicitly with org.hiber-nate.Hibernate.close(iterator).

Also note that Hibernate Criteria and Java Persistence, at the time of writing, don’t support this optimization.

Another optimized way to execute a query is scrolling through the result.

Scrolling with database cursors

Plain JDBC provides a feature called scrollable resultsets. This technique uses a cursor that is held on the database management system. The cursor points to a particular row in the result of a query, and the application can move the cursor forward and backward. You can even jump to a particular row with the cursor.

One of the situations where you should scroll through the results of a query instead of loading them all into memory involves resultsets that are too large to fit into memory. Usually you try to further restrict the result by tightening the conditions in the query. Sometimes this isn’t possible, maybe because you need all of the data, but want to retrieve it in several steps.

You’ve already seen scrolling in “Writing a procedure with batch updates”  how to implement procedures that work on batches of data, because this is where it’s most useful. The following example shows an overview of other interesting options on the ScrollableResults interface:

tmp1DC115_thumbtmp1DC116_thumb

This code doesn’t make much sense; it displays the most interesting methods on the ScrollableResults interface. You can set the cursor to the first and last Item object in the result, or get the Item the cursor is currently pointing to with get(). You can go to a particular Item by jumping to a position with setRowNumber() or scroll backward and forward with next() and previous(). Another option is scrolling forward and backward by an offset, with scroll().

Hibernate Criteria queries can also be executed with scrolling instead of list(); the returned ScrollableResults cursor works the same. Note that you absolutely must close the cursor when you’re done working with it, before you end the database transaction. Here is a Criteria example that shows the opening of a cursor:

tmp1DC117_thumb

The ScrollMode constants of the Hibernate API are equivalent to the constants in plain JDBC. In this case, the constant ensures that your cursor can only move forward. This may be required as a precaution; some JDBC drivers don’t support scrolling backward. Other available modes are ScrollMode.SCROLL_INSENSITIVE and ScrollMode.SCROLL_SENSITIVE. An insensitive cursor won’t expose you to modified data while the cursor is open (effectively guaranteeing that no dirty reads, unrepeatable reads, or phantom reads can slip into your resultset). On the other hand, a sensitive cursor exposes newly committed data and committed modifications to you while you work on your resultset. Note that the Hibernate persistence context cache still provides repeatable read for entity instances, so only modified scalar values you project in the resultset can be affected by this setting.

So far, the code examples we’ve shown all embed query string literals in Java code. This isn’t unreasonable for simple queries, but once you begin considering complex queries that must be split over multiple lines, this gets a bit unwieldy.

Using named queries

We don’t like to see HQL or JPA QL string literals scattered all over the Java code, unless really necessary. Hibernate lets you externalize query strings to the mapping metadata, a technique that is called named queries. This lets you store all queries related to a particular persistent class (or a set of classes) encapsulated with the other metadata of that class in an XML mapping file. Or, if you use annotations, you can create named queries as metadata of a particular entity class or put them into an XML deployment descriptor. The name of the query is used to call it from application code.

Calling a named query

In Hibernate, the getNamedQuery() method obtains a Query instance for a named query:

tmp1DC118_thumb

In this example, you call the named query findItemsByDescription and bind a string argument to the named parameter desc. Java Persistence also supports named queries:

tmp1DC119_thumb

Named queries are global—that is, the name of a query is considered to be a unique identifier for a particular SessionFactory or persistence unit. How and where they’re defined, in XML mapping files or annotations, is no concern of your application code. Even the query language doesn’t matter.

Defining a named query in XML metadata

You can place a named query inside any <hibernate-mapping> element in your XML metadata. In larger applications, we recommend isolating and separating all named queries into their own file. Or, you may want some queries to be defined in the same XML mapping file as a particular class.

The <query> defines a named HQL or JPA QL query:

tmp1DC120_thumb

You should wrap the query text into a CDATA instruction so the XML parser doesn’t get confused by any characters in your query string that may accidentally be considered XML (such as the less than operator).

If you place a named query definition inside a <class> element, instead of the root, it’s prefixed with the name of the entity class; for example, findItemsByDescription is then callable as auction.model.Item.findItemsByDescription. Otherwise, you need to make sure the name of the query is globally unique.

Named queries don’t have to be HQL or JPA QL strings; they may even be native SQL queries—and your Java code doesn’t need to know the difference:

tmp1DC121_thumb

This is useful if you think you may want to optimize your queries later by fine-tuning the SQL. It’s also a good solution if you have to port a legacy application to Hibernate, where SQL code was isolated from the hand-coded JDBC routines. With named queries, you can easily port the queries one-by-one to mapping files. We’ll have much more to say about native SQL queries in the next topic.

Defining a named query with annotations

The Java Persistence standard specifies the @NamedQuery and @NamedNativeQuery annotations. You can either place these annotations into the metadata of a particular class or into JPA XML descriptor file. Note that the query name must be globally unique in all cases; no class or package name is automatically prefixed.

Let’s assume you consider a particular named query to belong to a particular entity class:

All query hints that you set earlier with an API can also be set declaratively:

tmp1DC122_thumb

tmp1DC-123_thumb

A much more common solution is the encapsulation of queries in the orm.xml deployment descriptor:

tmp1DC-124_thumb

You can see that the Java Persistence descriptor supports an extension point: the hints element of a named-query definition. You can use it to set Hibernate-spe-cific hints, as you did earlier programmatically with the Query interface.

Native SQL queries have their own element and can also be either defined inside or outside an entity mapping:

tmp1DC-125_thumb

Embedding native SQL is much more powerful than we’ve shown so far (you can define arbitrary resultset mappings). We’ll get back to other SQL emedding options in the next topic.

We leave it up to you if you want to utilize the named query feature. However, we consider query strings in the application code (except if they’re in annotations) to be the second choice; you should always externalize query strings if possible.

You now know how to create, prepare, and execute a query with the Hibernate and Java Persistence APIs and metadata. It’s time to learn the query languages and options in more detail. We start with HQL and JPA QL.

Basic HQL and JPA QL queries

Let’s start with some simple queries to get familiar with the HQL syntax and semantics. We apply selection to name the data source, restriction to match records to the criteria, and projection to select the data you want returned from a query.

TRY IT Testing Hibernate queries—The Hibernate Tools for the Eclipse IDE support a Hibernate Console view. You can test your queries in the console window, and see the generated SQL and the result immediately.

You’ll also learn JPA QL in this section, because it’s a subset of the functionality of HQL—we’ll mention the differences when necessary.

When we talk about queries in this section, we usually mean SELECT statements, operations that retrieve data from the database. HQL also supports UPDATE, DELETE, and even INSERT .. SELECT statements,  “Bulk statements with HQL and JPA QL.” JPA QL includes UPDATE and DELETE. We won’t repeat these bulk operations here and will focus on SELECT statements. However, keep in mind that some differences between HQL and JPA QL may also apply to bulk operations—for example, whether a particular function is portable.

SELECT statements in HQL work even without a SELECT clause; only FROM is required. This isn’t the case in JPA QL, where the SELECT clause isn’t optional. This isn’t a big difference in practice; almost all queries require a SELECT clause, whether you write JPA QL or HQL. However, we start our exploration of queries with the FROM clause, because in our experience it’s easier to understand. Keep in mind that to translate these queries to JPA QL, you must theoretically add a SELECT clause to complete the statement, but Hibernate lets you execute the query anyway if you forget it (assuming SELECT *).

Selection

The simplest query in HQL is a selection (note that we don’t mean SELECT clause or statement here, but from where data is selected) of a single persistent class:

tmp1DC126_thumb

This query generates the following SQL:

tmp1DC127_thumb

Using aliases

Usually, when you select a class to query from using HQL or JPA QL, you need to assign an alias to the queried class to use as a reference in other parts of the query:

You assign the alias item to queried instances of the Item class, allowing you to refer to their property values later in the code (or query). To remind yourself of the similarity, we recommend that you use the same naming convention for aliases that you use for temporary variables (camelCase, usually). However, we may use shorter aliases in some of the examples in this topic, such as i instead of item, to keep the printed code readable.

FAQ Are HQL and JPA QL case sensitive? We never write HQL and JPA QL keywords in uppercase; we never write SQL keywords in uppercase either. It looks ugly and antiquated—most modern terminals can display both uppercase and lowercase characters. However, HQL and JPA QL aren’t case-sensitive for keywords, so you can write FROM Item AS item if you like shouting.

Polymorphic queries

tmp1DC128_thumb

The as keyword is always optional. The following is equivalent:

tmp1DC129_thumb

Think of this as being a bit like the temporary variable declaration in the following Java code:

tmp1DC130_thumb

HQL and JPA QL, as object-oriented query languages, support polymorphic queries— queries for instances of a class and all instances of its subclasses, respectively. You already know enough HQL and JPA QL to be able to demonstrate this. Consider the following query:

tmp1DC131_thumb

The class named in the from clause doesn’t even need to be a mapped persistent class; any class will do! The following query returns all persistent objects:

tmp1DC132_thumb

tmp1DC133_thumb

This returns objects of the type BillingDetails, which is an abstract class. In this case, the concrete objects are of the subtypes of BillingDetails: CreditCard and BankAccount. If you want only instances of a particular subclass, you may use

Note that Java Persistence doesn’t standardize polymorphic queries that use non-mapped interfaces. However, this works with Hibernate EntityManager.

Polymorphism applies not only to classes named explicitly in the FROM clause, but also to polymorphic associations, as you’ll see later in this topic.

We’ve discussed the FROM clause, now let’s move on to the other parts of HQL and JPA QL.

Restriction

Usually, you don’t want to retrieve all instances of a class. You must be able express constraints on the property values of objects returned by the query. This is called restriction. The WHERE clause is used to express a restriction in SQL, HQL, and JPA QL. These expressions may be as complex as you need to narrow down the piece of data you’re looking for. Note that restriction doesn’t only apply to SELECT statements; you also use a restriction to limit the scope of an UPDATE or DELETE operation.

Of course, this also works for interfaces—this query returns all serializable persistent objects:

tmp1DC134_thumb

Likewise, the following criteria query returns all persistent objects (yes, you can select all the tables of your database with such a query):

tmp1DC135_thumb

This is a typical WHERE clause that restricts the results to all User objects with the given email address:

tmp1DC136_thumb

You can include literals in your statements and conditions, with single quotes. Other commonly used literals in HQL and JPA QL are TRUE and FALSE:

tmp1DC137_thumb

A restriction is expressed using ternary logic. The WHERE clause is a logical expression that evaluates to true, false, or null for each tuple of objects. You construct

tmp1DC138_thumb

Notice that the constraint is expressed in terms of a property, email, of the User class, and that you use an object-oriented notion for this. The SQL generated by this query is

logical expressions by comparing properties of objects to other properties or literal values using the built-in comparison operators.

FAQ What is ternary logic? A row is included in an SQL resultset if and only if the WHERE clause evaluates to true. In Java, notNullObject==null evaluates to false and null==null evaluates to true. In SQL, NOT_NULL_COL-UMN=null and null=null both evaluate to null, not true. Thus, SQL needs a special operator, IS NULL, to test whether a value is null. This ternary logic is a way of handling expressions that may be applied to null column values. Treating null not as a special marker but as a regular value is an SQL extension to the familiar binary logic of the relational model. HQL and JPA QL have to support this ternary logic with ternary operators.

Let’s walk through the most common comparison operators. Comparison expressions

HQL and JPA QL support the same basic comparison operators as SQL. Here are a few examples that should look familiar if you know SQL:

tmp1DC139_thumb

Because the underlying database implements ternary logic, testing for null values requires some care. Remember that null = null doesn’t evaluate to true in SQL, but to null. All comparisons that use a null operand evaluate to null. (That’s why you usually don’t see the null literal in queries.) HQL and JPA QL provide an SQL-style IS [NOT] NULL operator:

tmp1DC140_thumb

This query returns all users with no email address and items which are sold.

The LIKE operator allows wildcard searches, where the wildcard symbols are % and _, as in SQL:

tmp1DC141_thumb

This expression restricts the result to users with a firstname starting with a capital G. You may also negate the LIKE operator, for example, in a substring match expression:

tmp1DC142_thumb

The percentage symbol stands for any sequence of characters; the underscore can be used to wildcard a single character. You can define an escape character if you want a literal percentage or underscore:

You can see the precedence of operators in table 14.1, from top to bottom.

The listed operators and their precedence are the same in HQL and JPA QL. The arithmetic operators, for example multiplication and addition, are self-explanatory. You’ve already seen how binary comparison expressions have the same semantics as their SQL counterpart and how to group and combine them with logical operators. Let’s discuss collection handling.

tmp1DC143_thumb

This query returns all users with a firstname that starts with %Foo. HQL and JPA QL support arithmetic expressions:

tmp1DC144_thumb

Logical operators (and parentheses for grouping) are used to combine expressions:

tmp1DC145_thumb

Table 14.1 HQL and JPA QL operator precedence


Operator

Description

tmp1DC-146

Navigation path expression operator

tmp1DC-147

Unary positive or negative signing (all unsigned numeric values are considered positive)

tmp1DC-148

Regular multiplication and division of numeric values

tmp1DC-149

Regular addition and subtraction of numeric values

tmp1DC-150

Binary comparison operators with SQL semantics

tmp1DC-151

Binary operators for collections in HQL and JPA QL

tmp1DC-152

Logical operators for ordering of expression evaluation

Expressions with collections

All expressions in the previous sections included only single-valued path expressions: user.email, bid.amount, and so on. You can also use path expressions that end in collections in the WHERE clause of a query, with the right operators.

For example, let’s assume you want to restrict your query result by the size of a collection:

tmp1DC153_thumb

Another common expression is concatenation—although SQL dialects are different here, HQL and JPA QL support a portable concat() function:

tmp1DC154_thumb

Also typical is an expression that requires the size of a collection:

tmp1DC155_thumb

JPA QL standardizes the most common functions, as summarized in table 14.2.

tmp1DC156_thumb

This query returns all Item instances that have an element in their bids collection. You can also express that you require a particular element to be present in a collection:

tmp1DC157_thumb

This query returns Item and Category instances—usually you add a SELECT clause and project only one of the two entity types. It returns an Item instance with the primary key ’123′ (a literal in single quotes) and all Category instances this Item instance is associated with. (Another trick you use here is the special .id path; this field always refers to the database identifier of an entity, no matter what the name of the identifier property is.)

There are many other ways to work with collections in HQL and JPA QL. For example, you can use them in function calls.

Calling functions

An extremely powerful feature of HQL is the ability to call SQL functions in the WHERE clause. If your database supports user-defined functions (most do), you can put this to all sorts of uses, good or evil. For the moment, let’s consider the usefulness of the standard ANSI SQL functions UPPER() and LOWER(). These can be used for case-insensitive searching:

Function

Applicability

tmp1DC-158

String values; returns a string value

tmp1DC-159

String values; returns a string value

tmp1DC-160

String values (offset starts at 1); returns a string value

tmp1DC-161

Trims spaces on BOTH sides of s if no char or other specification is given; returns a string value

tmp1DC-162

String value; returns a numeric value

tmp1DC-163

Searches for position of ss in s starting at offset; returns a numeric value

tmp1DC-164

Numeric values; returns an absolute of same type as input, square root as double, and the remainder of a division as an integer

tmp1DC-165

Collection expressions; returns an integer, or 0 if empty

All the standardized JPA QL functions may be used in the WHERE and HAVING clauses of a query (the latter you’ll see soon). The native HQL is a bit more flexible. First, it offers additional portable functions, as shown in table 14.3.

Table 14.3 Additional HQL functions

Function

Applicability

tmp1DC-166

Returns the number of bits in s

tmp1DC-167

Returns the date and/or time of the database management system machine

tmp1DC-168

Extracts the time and date from a temporal argument

tmp1DC-169

Casts a given type t to a Hibernate Type

tmp1DC-170

Returns the index of joined collection element

tmp1DC-171

Returns an element or index of indexed collections (maps, lists, arrays)

tmp1DC-172

Extends HQL with other functions in a dialect

Most of these HQL functions translate into a counterpart in SQL you’ve probably used before. This translation table is customizable and extendable with an org.hibernate.Dialect. Check the source code of the dialect you’re using for your database; you’ll probably find many other SQL functions already registered there for immediate use in HQL. Keep in mind that every function that isn’t included in the org.hibernate.Dialect superclass may not be portable to other database management systems!

Another recent addition to the Hibernate API is the addSqlFunction() method on the Hibernate Configuration API:

tmp1DC173_thumb

You specify ascending and descending order using asc or desc:

tmp1DC174_thumb

You may order by multiple properties:

tmp1DC175_thumb

You now know how to write a FROM, WHERE, and ORDER BY clause. You know how to select the entities you want to retrieve instances of and the necessary expressions

tmp1DC176_thumb

This operation adds the SQL function lpad to HQL. See the Javadoc of Standard-SQLFunction and its subclasses for more information.

HQL even tries to be smart when you call a function that wasn’t registered for your SQL dialect: Any function that is called in the WHERE clause of an HQL statement, and that isn’t known to Hibernate, is passed directly to the database, as an SQL function call. This works great if you don’t care about database portability, but it requires that you keep your eyes open for nonportable functions if you do care.

Finally, before we move on to the SELECT clause in HQL and JPA QL, let’s see how results can be ordered.

Ordering query results

All query languages provide some mechanism for ordering query results. HQL and JPA QL provide an ORDER BY clause, similar to SQL. This query returns all users, ordered by username:

and operations to restrict and order the result. All you need now is the ability to project the data of this result to what you need in your application.

Projection

The SELECT clause performs projection in HQL and JPA QL. It allows you to specify exactly which objects or properties of objects you need in the query result.

Simple projection of entities and scalar values

For example, consider the following HQL query:

tmp1DC177_thumb

This is a valid HQL query, but it’s invalid in JPA QL—the standard requires that you use a SELECT clause. Still, the same result that is implicit from this product of Item and Bid can also be produced with an explicit SELECT clause. This query returns ordered pairs of Item and Bid instances:

tmp1DC-178_thumb

This query returns a List of Object[]. At index 0 is the Item, and at index 1 is the Bid. Because this is a product, the result contains every possible combination of Item and Bid rows found in the two underlying tables. Obviously, this query isn’t useful, but you shouldn’t be surprised to receive a collection of Object[] as a query result.

The following explicit SELECT clause also returns a collection of Object[] s:

tmp1DC179_thumb

The Object[] s returned by this query contain a Long at index 0, a String at index 1, and a BigDecimal or MonetaryAmount at index 2. These are scalar values, not entity instances. Therefore, they aren’t in any persistent state, like an entity instance would be. They aren’t transactional and obviously aren’t checked automatically for dirty state. We call this kind of query a scalar query.

Getting distinct results

When you use a SELECT clause, the elements of the result are no longer guaranteed to be unique. For example, item descriptions aren’t unique, so the following query may return the same description more than once:

tmp1DC180_thumb

It’s difficult to see how it could be meaningful to have two identical rows in a query result, so if you think duplicates are likely, you normally use the DISTINCT keyword:

tmp1DC181_thumb

This eliminates duplicates from the returned list of Item descriptions. Calling functions

It’s also (for some Hibernate SQL dialects) possible to call database specific SQL functions from the SELECT clause. For example, the following query retrieves the current date and time from the database server (Oracle syntax), together with a property of Item:

tmp1DC182_thumb

The technique of database functions in the SELECT clause isn’t limited to database-dependent functions. it works with other more generic (or standardized) SQL functions as well:

tmp1DC183_thumb

This query returns Object[]s with the starting and ending date of an item auction, and the name of the item all in uppercase.

In particular, it’s possible to call SQL aggregate functions, which we’ll cover later in this topic. Note, however, that the Java Persistence standard and JPA QL don’t guarantee that any function that isn’t an aggregation function can be called in the SELECT clause. Hibernate and HQL allow more flexibility, and we think other products that support JPA QL will provide the same freedom to a certain extent. Also note that functions that are unknown to Hibernate aren’t passed on to the database as an SQL function call, as they are in the WHERE clause. You have to register a function in your org.hibernate.Dialect to enable it for the SELECT clause in HQL.

The previous sections should get you started with basic HQL and JPA QL. It’s time to look at the more complex query options, such as joins, dynamic fetching, subselects, and reporting queries.

Joins, reporting queries, and subselects

It’s difficult to categorize some queries as advanced and others as basic. Clearly, the queries we’ve shown you in the previous sections of this topic aren’t going to get you far.

At the least you also need to know how joins work. The ability to arbitrarily join data is one of the fundamental strengths of relational data access. Joining data is also the basic operation that enables you to fetch several associated objects and collections in a single query. We now show you how basic join operations work and how you use them to write a dynamic fetching strategy.

Other techniques we’d consider advanced include nesting of statements with subselects and report queries that aggregate and group results efficiently.

Let’s start with joins and how they can be used for dynamic fetching.

Joining relations and associations

You use a join to combine data in two (or more) relations. For example, you may join the data in the ITEM and BID tables, as shown in figure 14.1. (Note that not all columns and possible rows are shown; hence the dotted lines.)

What most people think of when they hear the word join in the context of SQL databases is an inner join. An inner join is the most important of several types of joins and the easiest to understand. Consider the SQL statement and result in figure 14.2. This SQL statement is an ANSI-style inner join in the FROM clause.

If you join tables ITEM and BID with an inner join, using their common attributes (the ITEM_ID column), you get all items and their bids in a new result table. Note that the result of this operation contains only items that have bids. If you want all items, and NULL values instead of bid data when there is no corresponding bid, you use a (left) outer join, as shown in figure 14.3.

You can think of a table join as working as follows. First, you take a product of the two tables, by taking all possible combinations of ITEM rows with BID rows.

The ITEM and BID tables are obvious candidates for a join

Figure 14.1 The ITEM and BID tables are obvious candidates for a join operation.

The result table of an ANSI-style inner join of two tables

Figure 14.2

The result table of an ANSI-style inner join of two tables

Second, you filter these joined rows using a join condition. (Any good database engine has much more sophisticated algorithms to evaluate a join; it usually doesn’t build a memory-consuming product and then filters all rows.) The join condition is a boolean expression that evaluates to true if the joined row is to be included in the result. In case of the left outer join, each row in the (left) ITEM table that never satisfies the join condition is also included in the result, with NULL values returned for all columns of BID.

A right outer join retrieves all bids and null if a bid has no item—not a sensible query in this situation. Right outer joins are rarely used; developers always think from left to right and put the driving table first.

In SQL, the join condition is usually specified explicitly. (Unfortunately, it isn’t possible to use the name of a foreign key constraint to specify how two tables are to be joined.) You specify the join condition in the ON clause for an ANSI-style join or in the WHERE clause for a so-called theta-style join, where I.ITEM_ID = B.ITEM_ID.

We now discuss the HQL and JPA QL join options. Remember that both are based on and translated into SQL, so even if the syntax is slightly different you should always refer to the two examples shown earlier and verify that you understood what the resulting SQL and resultset looks like.

The result of an ANSI-style left outer join of two tables

Figure 14.3

The result of an ANSI-style left outer join of two tables

HQL and JPA QL join options

In Hibernate queries, you don’t usually specify a join condition explicitly. Rather, you specify the name of a mapped Java class association. This is basically the same feature we’d prefer to have in SQL, a join condition expressed with a foreign key constraint name. Because you’ve mapped most, if not all, foreign key relationships of your database schema in Hibernate, you can use the names of these mapped associations in the query language. This is really syntactical sugar, but it’s convenient.

For example, the Item class has an association named bids with the Bid class. If you name this association in a query, Hibernate has enough information in the mapping document to then deduce the table join expression. This helps make queries less verbose and more readable.

In fact, HQL and JPA QL provide four ways of expressing (inner and outer) joins:

■ An implicit association join

■ An ordinary join in the FROM clause

■ A fetch join in the FROM clause

■ A theta-style join in the WHERE clause

Later we show you how to write a join between two classes that don’t have an association defined (a theta-style join) and how to write ordinary and fetch joins in the FROM clause of a query.

Implicit association joins are common abbreviations. (Note that we decided to make the following examples easier to read and understand by often omitting the SELECT clause—valid in HQL, invalid in JPA QL.)

Implicit association joins

So far, you’ve used simple qualified property names like bid.amount and item.description in queries. HQL and JPA QL support multipart property path expressions with a dot notation for two different purposes:

■ Querying components

■ Expressing implicit association joins

The first use is straightforward:

tmp1DC187_thumb

You reference parts of the mapped component Address with a dot notation. No tables are joined in this query; the properties of the homeAddress component are

This query returns a List of Strings. Because duplicates don’t make much sense, you eliminate them with DISTINCT. all mapped to the same table together with the User data. You can also write a path expression in the SELECT clause:

tmp1DC188_thumb

The second usage of multipart path expressions is implicit association joining:

tmp1DC189_thumb

This results in an implicitjoin on the many-to-one associations from Bid to Item— the name of this association is item. Hibernate knows that you mapped this association with the ITEM_ID foreign key in the BID table and generates the SQL join condition accordingly. Implicit joins are always directed along many-to-one or one-to-one associations, never through a collection-valued association (you can’t write item.bids.amount).

Multiple joins are possible in a single path expression. If the association from Item to Category is many-to-one (instead of the current many-to-many), you can write

tmp1DC190_thumb

It’s more obvious if you express this query with explicit HQL and JPA QL joins in the FROM clause.

tmp1DC191_thumb

We frown on the use of this syntactic sugar for more complex queries. SQL joins are important, and especially when optimizing queries, you need to be able to see at a glance exactly how many of them there are. Consider the following query (again, using a many-to-one from Item to Category):

tmp1DC192_thumb

How many joins are required to express this in SQL? Even if you get the answer right, it takes more than a few seconds to figure out. The answer is three; the generated SQL looks something like this:

Joins expressed in the FROM clause

Hibernate differentiates between the purposes for joining. Suppose you’re querying Items. There are two possible reasons why you may be interested in joining them with Bids.

You may want to limit the item returned by the query on the basis of some criterion that should be applied to their Bids. For example, you may want all Items that have a bid of more than $100; hence this requires an inner join. You aren’t interested in items that have no bids so far.

On the other hand, you may be primarily interested in the Items, but you may want to execute an outer join just because you want to retrieve all the Bids for the queried Items in the same single SQL statement, something we called eager join fetching earlier. Remember that you prefer to map all associations lazy by default, so an eager, outer-join fetch query is used to override the default fetching strategy at runtime for a particular use case.

Let’s first write some queries that use inner joins for the purpose of restriction. If you want to retrieve Item instances and restrict the result to items that have bids with a certain amount, you have to assign an alias to a joined association:

tmp1DC-193_thumb

The query returns all combinations of associated Bids and Items as ordered pairs:

tmp1DC-194_thumb

tmp1DC195_thumb

This query assigns the alias i to the entity Item and the alias b to the joined Items bids. You then use both aliases to express restriction criteria in the WHERE clause. The resulting SQL is:

Instead of a List of Items, this query returns a List of Object[] arrays. At index 0 is the Item, and at index 1 is the Bid. A particular Item may appear multiple times, once for each associated Bid. These duplicate items are duplicate in-mem-ory references, not duplicate instances!

If you don’t want the Bids in the query result, you may specify a SELECT clause in HQL (it’s mandatory anyway for JPA QL). You use the alias in a SELECT clause to project only the objects you want:

tmp1DC196_thumb

Now the generated SQL looks like this:

tmp1DC197_thumb

The query result contains just Items, and because it’s an inner join, only Items that have Bids:

tmp1DC-198_thumb

As you can see, using aliases in HQL and JPA QL is the same for both direct classes and joined associations. You used a collection in the previous examples, but the syntax and semantics are the same for single-valued associations, such as many-to-one and one-to-one. You assign aliases in the FROM clause by naming the association and then use the aliases in the WHERE and possibly SELECT clause.

HQL and JPA QL offer an alternative syntax for joining a collection in the FROM clause and to assign it an alias. This IN() operator has its history in an older version of EJB QL. It’s semantics are the same as those of a regular collection join. You can rewrite the last query as follows:

tmp1DC199_thumb

The from Item i in(i.bids) b results in the same inner join as the earlier example with from Item i join i.bids b.

So far, you’ve only written inner joins. Outer joins are mostly used for dynamic fetching, which we’ll discuss soon. Sometimes you want to write a simple query with an outer join without applying a dynamic fetching strategy. For example, the following query is a variation of the first query and retrieves items and bids with a minimum amount:

tmp1DC200_thumb

The first thing that is new in this statement is the LEFT keyword. Optionally you can write LEFT OUTER JOIN and RIGHT OUTER JOIN, but we usually prefer the short form. The second change is the additional join condition following the WITH keyword. If you place the b.amount > 100 expression into the WHERE clause you’d restrict the result to Item instances that have bids. This isn’t what you want here: You want to retrieve items and bids, and even items that don’t have bids. By adding an additional join condition in the FROM clause, you can restrict the Bid instances and still retrieve all Item objects. This query again returns ordered pairs of Item and Bid objects. Finally, note that additional join conditions with the WITH keyword are available only in HQL; JPA QL supports only the basic outer join condition represented by the mapped foreign key association.

A much more common scenario in which outer joins play an important role is eager dynamic fetching.

Dynamic fetching strategies with joins

All queries you saw in the previous section have one thing in common: The returned Item instances have a collection named bids. This collection, if mapped as lazy=”true” (default), isn’t initialized, and an additional SQL statement is triggered as soon as you access it. The same is true for all single-ended associations, like the seller of each Item. By default, Hibernate generates a proxy and loads the associated User instance lazily and only on-demand.

What options do you have to change this behavior? First, you can change the fetch plan in your mapping metadata and declare a collection or single-valued association as lazy=”false”. Hibernate then executes the necessary SQL to guarantee that the desired network of objects is loaded at all times. This also means that a single HQL or JPA QL statement may result in several SQL operations!

On the other hand, you usually don’t modify the fetch plan in mapping metadata unless you’re absolutely sure that it should apply globally. You usually write a new fetch plan for a particular use case. This is what you already did by writing HQL and JPA QL statements; you defined a fetch plan with selection, restriction, and projection. The only thing that will make it more efficient is the right dynamic fetching strategy. For example, there is no reason why you need several SQL statements to fetch all Item instances and to initialize their bids collections, or to retrieve the seller for each Item. This can be done at the same time, with a join operation.

In HQL and JPA QL you can specify that an associated entity instance or a collection should be eagerly fetched with the FETCH keyword in the FROM clause:

tmp1DC201_thumb

This query returns all items with a description that contains the string “Foo” and all their bids collections in a single SQL operation. When executed, it returns a list of Item instances, with their bids collections fully initialized. This is quite different if you compare it to the ordered pairs returned by the queries in the previous section!

The purpose of a fetch join is performance optimization: You use this syntax only because you want eager initialization of the bids collections in a single SQL operation:

tmp1DC202_thumb

This query executes the following SQL:

tmp1DC203_thumb

tmp1DC204_thumb

An additional WITH clause wouldn’t make sense here. You can’t restrict the Bid instances: All the collections must be fully initialized.

You can also prefetch many-to-one or one-to-one associations, using the same syntax:

If you write JOIN FETCH. without LEFT, you get eager loading with an inner join (also if you use INNER JOIN FETCH ); a prefetch with an inner join, for example, returns Item objects with their bids collection fully initialized, but no Item objects that don’t have bids. Such a query is rarely useful for collections but can be used for a many-to-one association that isn’t nullable; for example, join fetch item.seller works fine.

Dynamic fetching in HQL and JPA QL is straightforward; however, you should remember the following caveats:

■ You never assign an alias to any fetch-joined association or collection for further restriction or projection. So left join fetch i.bids b where b = … is invalid, whereas left join fetch i.bids b join fetch b.bidder is valid.

■ You shouldn’t fetch more than one collection in parallel; otherwise you create a Cartesian product. You can fetch as many single-valued associated objects as you like without creating a product.

■ HQL and JPA QL ignore any fetching strategy you’ve defined in mapping metadata. For example, mapping the bids collection in XML with fetch=”join”, has no effect on any HQL or JPA QL statement. A dynamic fetching strategy ignores the global fetching strategy (on the other hand, the global fetch plan isn’t ignored—every nonlazy association or collection is guaranteed to be loaded, even if several SQL queries are needed).

■ If you eager-fetch a collection, duplicates may be returned. Look at figure 14.3: This is exactly the SQL operation that is executed for a select i from Item i join fetch i.bids HQL or JPA QL query. Each Item is duplicated on the left side of the result table as many times as related Bid data is present. The List returned by the HQL or JPA QL query preserves these duplicates as references. If you prefer to filter out these duplicates you need to either wrap the List in a Set (for example, with Set noDupes = new LinkedHashSet(resultList)) or use the DISTINCT keyword: select distinct i from Item i join fetch i.bids —note that in this case the DISTINCT doesn’t operate at the SQL level, but forces Hibernate to filter out duplicates in memory when marshaling the result into objects. Clearly, duplicates can’t be avoided in the SQL result.

■ Query execution options that are based on the SQL result rows, such as pagination with setMaxResults() /setFirstResult(), are semantically incorrect if a collection is eagerly fetched. If you have an eager fetched collection in your query, at the time of writing, Hibernate falls back to limiting the result in-memory, instead of using SQL. This may be less efficient, so we don’t recommend the use of JOIN FETCH with setMaxResults()/set-FirstResult(). Future versions of Hibernate may fall back to a different SQL query strategy (such as two queries and subselect fetching) if setMaxRe-sults() /setFirstResult() is used in combination with a JOIN FETCH.

This is how Hibernate implements dynamic association fetching, a powerful feature that is essential for achieving high performance in any application. Tuning the fetch plan and fetching strategy with queries is your first optimization, followed by global settings in mapping metadata when it becomes obvious that more and more queries have equal requirements. The last join option on the list is the theta-style join.

Theta-style joins

A product lets you retrieve all possible combinations of instances of two or more classes. This query returns all ordered pairs of Users and Category objects:

tmp1DC205_thumb

The join condition here is a comparison of username, present as an attribute in both classes. If both rows have the same username, they’re joined (with an inner join) in the result. The query result consists of ordered pairs:

tmp1DC206_thumb

Obviously, this isn’t usually useful. There is one case where it’s commonly used: theta-style joins.

In traditional SQL, a theta-style join is a Cartesian product together with a join condition in the WHERE clause, which is applied on the product to restrict the result.

In HQL and JPA QL, the theta-style syntax is useful when your join condition isn’t a foreign key relationship mapped to a class association. For example, suppose you store the User’s name in log records, instead of mapping an association from LogRecord to User. The classes don’t know anything about each other, because they aren’t associated. You can then find all the Users and their LogRecords with the following theta-style join:

tmp1DC-207_thumb

You can of course apply a SELECT clause to project only the data you’re interested in.

You probably won’t need to use the theta-style joins often. Note that it’s currently not possible in HQL or JPA QL to outer join two tables that don’t have a mapped association—theta-style joins are inner joins.

Finally, it’s extremely common to perform queries that compare primary key or foreign key values to either query parameters or other primary or foreign key values.

Comparing identifiers

If you think about identifier comparison in more object-oriented terms, what you’re really doing is comparing object references. HQL and JPA QL support the following:

tmp1DC208_thumb

On the other hand, the following theta-style join can’t be re-expressed as a FROM clause join:

tmp1DC209_thumb

tmp1DC210_thumb

In this query, i.seller refers to the foreign key to the USER table in the ITEM table (on the SELLER_ID column), and user refers to the primary key of the USER table (on the USER_ID column). This query uses a theta-style join and is equivalent to the much preferred

In this case, i.seller and b.bidder are both foreign keys of the USER table. Note that this is an important query in the application; you use it to identify people bidding for their own items.

You may also want to compare a foreign key value to a query parameter, perhaps to find all Comments from a User:

tmp1DC211_thumb

Alternatively, sometimes you prefer to express these kinds of queries in terms of identifier values rather than object references. An identifier value may be referred to by either the name of the identifier property (if there is one) or the special property name id. (Note that only HQL guarantees that id always refers to any arbitrarily named identifier property; JPA QL doesn’t.)

These queries are equivalent to the earlier queries:

tmp1DC212_thumb

However, you may now use the identifier value as a query parameter:

tmp1DC-213_thumb

Considering identifier attributes, there is a world of difference between the following queries:

tmp1DC214_thumb

The second query uses an implicit table join; the first has no joins at all!

This completes our discussion of queries that involve joins. You learned how to write a simple implicit inner join with dot notation and how to write an explicit inner or outer join with aliases in the FROM clause. We also looked at dynamic fetching strategies with outer and inner join SQL operations.

Our next topic is advanced queries that we consider to be mostly useful for reporting.

Reporting queries

Reporting queries take advantage of the database’s ability to perform efficient grouping and aggregation of data. They’re more relational in nature; they don’t always return entities. For example, instead of retrieving Item entities that are in persistent state (and automatically dirty checked), a report query may only retrieve the Item names and initial auction prices. If this is the only information you need (maybe even aggregated, the highest initial price in a category, and so on.) for a report screen, you don’t need transactional entity instances and can save the overhead of automatic dirty checking and caching in the persistence context.

HQL and JPA QL allow you to use several features of SQL that are most commonly used for reporting—although they’re also used for other things. In reporting queries, you use the SELECT clause for projection and the GROUP BY and HAVING clauses for aggregation.

Because we’ve already discussed the basic SELECT clause, we’ll go straight to aggregation and grouping.

Projection with aggregation functions

The aggregate functions that are recognized by HQL and standardized in JPA QL are count(), min(), max(), sum() and avg(). This query counts all the Items:

tmp1DC215_thumb

The result is returned as a Long:

tmp1DC216_thumb

The next variation of the query counts all Items which have a successfulBid (null values are eliminated):

tmp1DC217_thumb

This query calculates the total of all the successful Bids:

tmp1DC218_thumb

The query returns a BigDecimal, because the amount property is of type BigDeci-mal. The SUM() function also recognizes BigInteger property types and returns Long for all other numeric property types. Notice the use of an implicit join in the SELECT clause: You navigate the association (successfulBid) from Item to Bid by referencing it with a dot.

The result is an ordered pair of BigDecimals (two instances of BigDecimals, in an Object[] array).

The special COUNT(DISTINCT) function ignores duplicates:

tmp1DC219_thumb

When you call an aggregate function in the SELECT clause, without specifying any grouping in a GROUP BY clause, you collapse the result down to a single row, containing the aggregated value(s). This means that (in the absence of a GROUP BY clause) any SELECT clause that contains an aggregate function must contain only aggregate functions.

For more advanced statistics and reporting, you need to be able to perform grouping.

Grouping aggregated results

Just like in SQL, any property or alias that appears in HQL or JPA QL outside of an aggregate function in the SELECT clause must also appear in the GROUP BY clause. Consider the next query, which counts the number of users with each last name:

tmp1DC220_thumb

Look at the generated SQL:

tmp1DC221_thumb

tmp1DC222_thumb

This query returns ordered pairs of Item identifier and average bid amount values. Notice how you use the id special property to refer to the identifier of a

In this example, the u.lastname isn’t inside an aggregate function; you use it to group the result. You also don’t need to specify the property you like to count. The generated SQL automatically uses the primary key, if you use an alias that has been set in the FROM clause.

The next query finds the average bid amount for each item: persistent class, no matter what the real property name of the identifier is. (Again, this special property isn’t standardized in JPA QL.)

The next query returns the minimum and maximum bid amounts for a particular Item:

tmp1DC223_thumb

The next query counts the number of bids and calculates the average bid per unsold item:

tmp1DC224_thumb

That query uses an implicit association join. For an explicit ordinary join in the FROM clause (not a fetch join), you can re-express it as follows:

tmp1DC225_thumb

Sometimes, you want to further restrict the result by selecting only particular values of a group.

Restricting groups with having

The WHERE clause is used to perform the relational operation of restriction upon rows. The HAVING clause performs restriction upon groups.

For example, the next query counts users with each last name that begins with “A”:

tmp1DC226_thumb

The same rules govern the SELECT and HAVING clauses: Only grouped properties may appear outside of an aggregate function. The next query counts the number of bids per unsold item, returning results for only those items that have more than 10 bids:

tmp1DC227_thumb

Most report queries use a SELECT clause to choose a list of projected or aggregated properties. You’ve seen that when there is more than one property or alias listed in the SELECT clause, Hibernate returns the query results as tuples—each row of the query result list is an instance of Object[].

Utilizing dynamic instantiation

Tuples, especially common with report queries, are inconvenient, so HQL and JPA QL provide a SELECT NEW constructor call. In addition to creating new objects dynamically with this technique, you can also use it in combination with aggregation and grouping.

If you define a class called ItemBidSummary with a constructor that takes a Long, a Long, and a BigDecimal, the following query may be used:

tmp1DC228_thumb

In the result of this query, each element is an instance of ItemBidSummary, which is a summary of an Item, the number of bids for that item, and the average bid amount. Note that you have to write a fully qualified classname here, with a package name. unless the class has been imported into the HQL namespace.This approach is type-safe, and a data transfer class such as ItemBidSummary can easily be extended for special formatted printing of values in reports.

The ItemBidSummary class is a Java bean, it doesn’t have to be a mapped persistent entity class. On the other hand, if you use the SELECT NEW technique with a mapped entity class, all instances returned by your query are in transient state—so you can use this feature to populate several new objects and then save them.

Report queries can have an impact on the performance of your application. Let’s explore this issue some more.

Improving performance with report queries

The only time we have ever seen any significant overhead in Hibernate code compared to direct JDBC queries—and then only for unrealistically simple toy test cases—is in the special case of read-only queries against a local database. In this case, it’s possible for a database to completely cache query results in memory and respond quickly, so benchmarks are generally useless if the dataset is small: Plain SQL and JDBC are always the fastest option.

Hibernate, on the other hand, even with a small dataset, must still do the work of adding the resulting objects of a query to the persistence context cache (perhaps also the second-level cache) and manage uniqueness, and so on. If you ever wish to avoid the overhead of managing the persistence context cache, report queries give you a way to do this. The overhead of a Hibernate report query compared to direct SQL/JDBC isn’t usually measurable, even in unrealistic extreme cases, like loading one million objects from a local database without network latency.

Report queries using projection in HQL and JPA QL let you specify which properties you wish to retrieve. For report queries, you aren’t selecting entities in managed state, but only properties or aggregated values:

tmp1DC229_thumb

This query doesn’t return persistent entity instances, so Hibernate doesn’t add any persistent object to the persistence context cache. This means that no object must be watched for dirty state either.

Therefore, reporting queries result in faster release of allocated memory, because objects aren’t kept in the persistence context cache until the context is closed—they may be garbage collected as soon as they’re dereferenced by the application, after executing the report.

Almost always, these considerations are extremely minor, so don’t go out and rewrite all your read-only transactions to use report queries instead of transac-tional, cached, and managed objects. Report queries are more verbose and (arguably) less object-oriented. They also make less efficient use of Hibernate’s caches, which is much more important once you consider the overhead of remote communication with the database in production systems. You should wait until you find a case where you have a real performance problem before using this optimization.

You can already create really complex HQL and JPA QL queries with what you’ve seen so far. Even more advanced queries may include nested statements, known as subselects.

Using subselects

An important and powerful feature of SQL is subselects. A subselect is a select query embedded in another query, usually in the SELECT, FROM, or WHERE clauses.

HQL and JPA QL support subqueries in the WHERE clause. Subselects in the FROM clause aren’t supported by HQL and JPA QL (although the specification lists them as a possible future extension) because both languages have no transitive closure. The result of a query may not be tabular, so it can’t be reused for selection in a

FROM clause. Subselects in the SELECT clause are also not supported in the query language, but can be mapped to properties with a formula.

(Some platforms supported by Hibernate don’t implement SQL subselects. Hibernate supports subselects only if the SQL database management system provides this feature.)

Correlated and uncorrelated nesting

The result of a subquery may contain either a single row or multiple rows. Typically, subqueries that return single rows perform aggregation. The following sub-query returns the total number of items sold by a user; the outer query returns all users who have sold more than 10 items:

tmp1DC230_thumb

This is a correlated subquery—it refers to an alias (u) from the outer query The next subquery is an uncorrelated subquery:

tmp1DC231_thumb

The subquery in this example returns the maximum bid amount in the entire system; the outer query returns all bids whose amount is within one (dollar) of that amount.

Note that in both cases, the subquery is enclosed in parentheses. This is always required.

Uncorrelated subqueries are harmless, and there is no reason to not use them when convenient, although they can always be rewritten as two queries (they don’t reference each other). You should think more carefully about the performance impact of correlated subqueries. On a mature database, the performance cost of a simple correlated subquery is similar to the cost of a join. However, it isn’t necessarily possible to rewrite a correlated subquery using several separate queries.

Quantification

If a subquery returns multiple rows, it’s combined with quantification. ANSI SQL, HQL, and JPA QL define the following quantifiers:

■ ALL—The expression evaluates to true if the comparison is true for all values in the result of the subquery. It evaluates to false if a single value of the subquery result fails the comparison test.

■ ANY—The expression evaluates to true if the comparison is true for some (any) value in the result of the subquery. If the subquery result is empty or no value satisfies the comparison, it evaluates to false. The keyword SOME is a synonym for ANY.

■ IN—This binary comparison operator can compare a list of values against the result of a subquery and evaluates to true if all values are found in the result.

For example, this query returns items where all bids are less than 100:

tmp1DC232_thumb

The next query returns all the others, items with bids greater than 100:

tmp1DC233_thumb

This query returns items with a bid of exactly 100:

tmp1DC234_thumb

So does this one:

tmp1DC235_thumb

HQL supports a shortcut syntax for subqueries that operate on elements or indices of a collection. The following query uses the special HQL elements() function:

tmp1DC-236_thumb

The query returns all categories to which the item belongs and is equivalent to the following HQL (and valid JPA QL), where the subquery is more explicit:

tmp1DC-237_thumb

Along with elements(), HQL provides indices(), maxelement(), minelement(), maxindex(), minindex(), and size(), each of which is equivalent to a certain correlated subquery against the passed collection. Refer to the Hibernate documentation for more information about these special functions; they’re rarely used.

Subqueries are an advanced technique; you should question frequent use of subqueries because queries with subqueries can often be rewritten using only joins and aggregation. However, they’re powerful and useful from time to time.

Summary

You’re now able to write a wide variety of queries in HQL and JPA QL. You learned in this topic how to prepare and execute queries, and how to bind parameters. We’ve shown you restriction, projection, joins, subselects, and many other options that you probably already know from SQL.

Table 14.4 shows a summary you can use to compare native Hibernate features and Java Persistence.

Table 14.4 Hibernate and JPA comparison chart

Hibernate Core

Java Persistence and EJB 3.0

Hibernate APIs support query execution with listing, iteration, and scrolling.

Java Persistence standardizes query execution with listing.

Hibernate supports named and positional query bind parameters.

Java Persistence standardizes named and positional bind parameter options.

Hibernate query APIs support application-level query hints.

Java Persistence allows developers to supply arbitrary vendor-specific (Hibernate) query hints.

HQL supports SQL-like restriction, projection, joins, subselects, and function calls.

JPA QL supports SQL-like restriction, projection, joins, subselects, and function calls—subset of HQL.

.

In the next topic we focus on more advanced query techniques, such as programmatic generation of complex queries with the Criteria API and embedding of native SQL queries. We’ll also talk about the query cache and when you should enable it.

Next post:

Previous post: