Databases Reference
In-Depth Information
The Sort has also been exchanged for a Hash Match. These new additional operators may be more costly,
but use almost no I/O, hence the decrease in duration. More importantly, the optimizer actually had
the chance to alter the implementation of how the query would be executed. When we code specific
implementations using derived tables and correlated subqueries, we leave the optimizer no alternatives
to evaluate and we denigrate the declarative nature of T-SQL. As SQL Server advances, you are going
to see more instances of where the old-school methods of rearranging SQL statements become irrelevant
outside of very narrow edge-case circumstances. In this case, it appears that the Join method is easy to
code and provides rather consistent performance of speed under increasing scale.
Paging and Sorting
Most applications have a need to page through a sorted resultset to display page N of M. There are several
methods of meeting this requirement and evaluating them all is a dense enough topic to warrant its own
chapter. Considerations for the perfect solution depend upon the selectivity options in the WHERE
predicate, the number of results that the predicate yields, the number and nature of the ORDER BY
scenarios allowed, whether total predicate realization as a total count is required, and finally the ease of
maintainability. Since you are reading this chapter, you're probably most interested in this topic from a
perspective of page-read-based performance.
The core of the problem is to apply a predicate to a set of tables and retrieve only a subset of the entire
resultset ordered in a specified sort. For server-based solutions, the key is having some way to pull out
the rows that constitute a viewable page by enumerating the resultset in the way it is ordered. There are
two approaches to uniquely identifying the rows returned and breaking the ties: applying a numbering
function to the resultset or using a unique identifying index from the results. For the results to be
consistent between incremental page calls, having the primary identity key as the tiebreaker is
convenient because it is intimately tied to each row. You can do the same thing with an applied function,
but pay attention to how the resultset is ordered by sorting finally by the identity key. If you don't have
a unique identifying key, then the only option is to use the ranking functions and deal with the results.
This is equivalent to searching for T-SQL in an instantaneously indexed search engine and getting slightly
different results each time. The variant results depend on how many new articles on T-SQL were added
since your last search.
To follow along with the explanations for paging and sorting, you'll need to run this bit of T-SQL to
create the TRANS_DATA table (which will contain 1 million row entries):
CREATE TABLE TRANS_TABLE(
MYID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
MYDESC VARCHAR(10),
MYDATE DATETIME,
MYGROUPID INT)
DECLARE @I INT
SET @I = 0
WHILE @I < 1000000
BEGIN
INSERT INTO TRANS_TABLE
SELECT CHAR(ASCII('A') - 2 + (2 * (1 + ABS(CHECKSUM(NEWID())) % 26))),
DATEADD(day, ABS(CHECKSUM(NEWID())) % 365, '01/01/2007'),
(ABS(CHECKSUM(NEWID())) % 10)
SET @I = @I + 1
END
Search WWH ::




Custom Search