Wednesday, April 04, 2007

ORDER BY and FETCH FIRST in Subselects [DB2 9 for z/OS]

Here is another quick post in my series on new features in DB2 9 for z/OS.

Today, we will look at the additional flexibility gained in how the ORDER BY and FETCH FIRST n ROWS ONLY clauses can be specified in V9. Prior to the V9, the only place you could specify these clauses was at the statement level. Indeed, this has been a source of confusion for many DB2 SQL programmers.

A discussion of DB2 SELECT should be broken down into three topics:
  1. fullselect,
  2. subselect, and
  3. select-statement.
The select-statement is the form of a query that can be directly specified in a DECLARE CURSOR statement, or prepared and then referenced in a DECLARE CURSOR statement. It is the thing most people think of when they think of SELECT in all its glory. If so desired, it can be issued interactively using SPUFI. The select-statement consists of a fullselect, and any of the following optional clauses: order-by, fetch-first, update, read-only, optimize-for, isolation and queryno. Well, that is, until V9 which still allows the fetch-first and order-by at this level, but also at the fullselect and subselect level!

A fullselect can be part of a select-statement, a CREATE VIEW statement, or an INSERT statement. Basically, a fullselect specifies a result table. Prior to V9, this sometimes confused folks as they tried to put a FETCH FIRST n ROWS clause or an ORDER BY in a view or as part of an INSERT. That was not allowed!

Finally, a subselect is a component of the fullselect. A subselect specifies a result table derived from the result of its first FROM clause. The derivation can be described as a sequence of operations in which the result of each operation is input for the next.

This is all a bit confusing. Think of it this way: in a subselect you specify the FROM to get the tables, the WHERE to get the conditions, GROUP BY to get aggregation, HAVING to get the conditions on the aggregated data, and the SELECT clause to get the actual columns. In a fullselect you add in the UNION to combine subselects and other fullselects. Finally, you add on any optional order-by, fetch-first, update, read-only, optimize-for, isolation and queryno clauses to get the select-statement.

But, of course, as of V9 you can use the order-by and/or the fetch-first at the subselect or fullselect level. This can be useful if you want to limit the results within a subquery or part of a UNION statement. Also, if you specify ORDER BY with FETCH FIRST n ROWS ONLY, the result is ordered before the fetch-first is applied. (That's a good thing.)

So, that means all of the following are now legal SQL formulations in V9:

(SELECT COL1 FROM T1
UNION
SELECT COL1 FROM T2
ORDER BY 1)
UNION
SELECT COL1 FROM T3
ORDER BY 1;


This example shows how ORDER BY can be applied in fullselects with UNION.

SELECT EMP_ACT.EMPNO, PROJNO
FROM EMP_ACT
WHERE EMP_ACT.EMPNO IN (SELECT EMPLOYEE.EMPNO
FROM EMP
ORDER BY SALARY DESC
FETCH FIRST 10 ROWS ONLY);


And this example will return the employee number and project number for any projects assigned to employees with one of the top ten salaries.

So, once you move to V9 you will have much more lattitude in how you write your SELECTs. If you are interested in more details, here is a link to the section of the DB2 9 for z/OS SQL Reference manual on building SQL queries.

No comments: