Friday, January 27, 2006

Adding Column Names to an Unload File

I received an e-mail from a reader asking an interesting question. She wanted to know if any of the DB2 unload utilities are able to include the column names in the same file as the unload output data. This was a requirement because one of the applications her company was planning to use takes the column headings and dynamically creates tables.

My initial reaction was that she could write a program that take the output file and reads the system catalog to gather the appropriate column names. That would work, but she actually came up with a better solution.

She used a third party Unload utility (but it would also have worked with IBM's Unload, too) to perform two SELECT statements. The first just creates the column headers and the second for the actual data. The column and data goes to two separate datasets, but they used IDCAMS to concatenate the 2 separate column/data sets into 1 dataset.

The SQL is simple, something like as follows:

SELECT 'COLUMN NAME 1', 'COLUMN NAME 2',
'COLUMN NAME 3', 'COLUMN NAME 4'
FROM SYSIBM.SYSDUMMY1

SELECT COL1, COL2, COL3, COL4
FROM XXXXXX.TABLE1

Of course, you'd just plug in the correct column names in the literals of the first SELECT statement -- and the correct column names in the second.

Saturday, January 21, 2006

My DB2 Articles

I've written a lot about DB2 over the years and I try to keep most everything I've written up and available over the web. If you are ever interested in finding a DB2 article of mine that you've read but can't find try the following link:

http://www.craigsmullins.com/art_db2.htm

That page contains links to all of the DB2 articles that I've written. It is in reverse chronological order...

Sunday, January 15, 2006

Design Guidelines for High Performance and Availability

Just a short blog entry this week to alert everyone to a newly published IBM redbook titled "DB2 UDB for z/OS: Design Guidelines for High Performance and Availability." This redbook is an essential read for anyone who is interested in squeezing the highest levels of availability and optimum performance from a mainframe DB2 application.

In just under 500 pages the authors share a wealth of tuning techniques that all DB2 shops should have access to. Download the manual (SG24-7134) for free from the web by clicking here.

Monday, January 09, 2006

Basic DB2 Buffering and Memory Guidelines

One of the most important areas for tuning DB2 subsystem performance is memory usage. DB2 for z/OS uses memory for buffer pools, the EDM pool, RID pool and sort pools to cache data and structures in memory. The better memory is allocated to these structures, the better DB2 will perform.

When allocating DB2 buffer pools, keep these rules of thumb in mind:

  • Don't allocate everything to a single buffer pool (e.g., BP0); use a multiple buffer pool strategy.
  • Explicitly specify a buffer pool for every table space and index.
  • Isolate the DB2 Catalog in BP0; put user and application DB2 objects into other buffer pools.
  • Consider separating indexes from table spaces with each in their own dedicated buffer pools.
  • Consider isolating heavily hit data into its own buffer pool to better control performance.
  • Consider isolating sorts into a single buffer pool and tuning for mostly sequential access (e.g. BP7).
  • Consider separating DB2 objects into separate buffer pools that have been configured for sequential verses random access.
Forget about trying to follow a cookie-cutter approach to buffer pool management. Every shop must create and optimize a buffer pool strategy for its own data and application mix. DB2 offers the following buffer pool tuning "knobs" that can be used to configure virutal buffer pools to the type of processing they support:

DWQT –this value is the deferred write threshold; it is expressed as a percentage of the virtual buffer pool that might be occupied by unavailable pages. When this threshold is reached DB2 will start to schedule write I/Os to externalize data. The default is 50%, which is likely to be too high for most shops.

VDWQT – this value is the vertical deferred write threshold; it is basically the same as DWQT, but for individual data sets. The default is 10%, which once again is quite likely to be too high for many shops.

VPSEQT – this value is the sequential steal threshold; it is a expressed as a percentage of the virtual buffer pool that can be occupied by sequentially accessed pages. Tune buffer pools for sequential access such as scans and sorting by modifying VPSEQT to a larger value. The default is 80%.

VPPSEQT – this value is the sequential steal threshold for parallel operations; the default value is 50%.

VPXPSEQT – this value is assisting parallel sequential threshold; it is basically the VPPSEQT for opertaions from another DB2 subsystem in the data sharing group.

These parameters can be changed using the ALTER BUFFERPOOL command. Additionally, prior to DB2 V8, hiperpools can be created to back up DB2 virtual buffer pools with additional memory. DB2 provides several tuning knobs for hiperpools, too, including HPSIZE to adjust the size of hiperpools and HPSEQT to adjust the hiperpool sequential steal threshold. Hiperpools are obsolete as of V8 though, so if you don't use them today you should probably spend your time migrating to V8 instead of implementing soon-to-be-obsolete hiperpools.

With the advent of DB2 V8, there is more memory at your disposal for DB2's use. V8 is able to surmount the limitation of 2GB real storage that was imposed due to S/390's 31-bit addressing. Theoretically, with 64-bit addressing DB2 could have up to 16 exabytes of virtual storage addressability to be used by a single DB2 address space. Now there is some room for growth!

In addition to buffer pools, DB2 uses memory for the EDM pool. The EDM pool is used for caching internal structures used by DB2 programs. This includes DBDs, SKCTs, CTs, SKPTs, and PTs. It also includes the authorization cache for plans and packages, as well as the cache for dynamic SQL mini-plans.

With V8, DB2 breaks the EDM pool into separate pools: one for DBDs, one for the dynamic statement cache, and the final one for program elements (CTs, SKCTs, PTs, SKPTs).

As a general rule of thumb, shoot for an 80 percent hit rate with the EDM pool; this means that only one out every five times should a structure need to be loaded from disk into the EDM pool.

Finally, remember that buffer and EDM pool tuning are in-depth subjects that cannot be adequately covered in a high-level blog entry such as this. So, study those IBM DB2 manuals - and learn by doing. Additionally, there is much more to proper DB2 system performance tuning than memory tuning. Other system elements requiring attention include allied agent setup (CICS, TSO, etc.), network configuration, locking, logging, and Parallel Sysplex configuration and management for DB2 data-sharing shops.