Wednesday, September 18, 2013

Using the DISPLAY Command, Part 1

Oftentimes, some of the simplest commands and features of DB2 can be used to great effect -- if you just remember that they are there and learn how to use them. The DISPLAY command is one of those features.

Although a DB2 performance monitor is probably the best solution for gathering information about your DB2 subsystems and databases, you can gain significant insight into “what is going on out there” using the simple DISPLAY command.

In today's blog post, and a series of subsequent posts, we will learn about the DISPLAY command and how it can be used to monitor various aspects of DB2.

There are eight variations of the DISPLAY command that you can utilize, depending on the type of information you are looking for. The DISPLAY command can be used to return information about the status of DB2 data sharing groups, databases and table spaces, threads, stored procedures, user-defined functions, utilities, and traces; it can also monitor the Resource Limit Facility (RLF) and distributed data locations.

Let’s start with coverage of what is probably the most often-used variation of the DISPLAY command, looking into database information.

By running the DISPLAY DATABASE command, you can gather information on DB2 databases and table spaces. The output of the basic command will show the status of the objects specified along with any exception states that apply. For example:

-DISPLAY DATABASE(DBNAME)

Issuing this command will display details on the DBNAME database, including information about the table spaces and indexes in that database. So, with a simple command you can easily find all of the table spaces and indexes within any database — pretty powerful stuff. But the status information for each space is useful, too. When a status other than RO or RW is encountered, the object is in an indeterminate state or is being processed by a DB2 utility. The possible statuses that DB2 can assign to a page set are detailed below:

ARBDP
Index is in Advisory Rebuild Pending status; the index should be rebuilt to improve performance and allow the index to be used for index-only access again.
AREO*
The table space, index, or partition is in Advisory Reorg Pending status; the object should be reorganized to improve performance. This status is new as of DB2 V8.
ACHKP
The Auxiliary Check Pending status has been set for the base table space. An error exists in the LOB column of the base table space.
AREST
The table space, index space, or partition is in Advisory Restart Pending status. If back-out activity against the object is not already underway, either issue the RECOVER POSTPONED command or recycle DB2, specifying LBACKOUT=AUTO.
AUXW
Either the base table space or the LOB table space is in the Auxiliary Warning status. This warning status indicates an error in the LOB column of the base table space or an invalid LOB in the LOB table space.
CHKP
The Check Pending status has been set for this table space or partition.
COPY
The Copy Pending flag has been set for this table space or partition.
DEFER
Deferred restart is required for the object.
GRECP
The table space, table space partition, index, index partition, or logical index partition is in the group buffer pool Recover Pending state.
ICOPY
The index is in Informational Copy Pending status.
INDBT
In-doubt processing is required for the object.
LPL
The table space, table space partition, index, index partition, or logical index partition has logical page errors.
LSTOP
The logical partition of a non-partitioning index is stopped.
OPENF
The table space, table space partition, index, index partition, or logical index partition had an open data set failure.
PSRCP
Indicates Page Set Recover Pending state for an index (non-partitioning indexes).
PSRBD
The non-partitioning index space is in a Page Set Rebuild Pending status.
RBDP
The physical or logical index partition is in the Rebuild Pending status.
RBDP*
The logical partition of a non-partitioning index is in the Rebuild Pending status, and the entire index is inaccessible to SQL applications. However, only the logical partition needs to be rebuilt.
RECP
The Recover Pending flag has been set for this table space, table space partition, index, index partition, or logical index partition.
REFP
The table space, index space, or index is in Refresh Pending status.
RELDP
The object has a release dependency.
REORP
The data partition is in a REORG Pending state.
REST
Restart processing has been initiated for the table space, table space partition, index, index partition, or logical index partition.
RESTP
The table space or index is in the Restart Pending status.
RO
The table space, table space partition, index, index partition, or logical index partition has been started for read-only processing.
RW
The table space, table space partition, index, index partition, or logical index partition has been started for read and write processing.
STOP
The table space, table space partition, index, index partition, or logical index partition has been stopped.
STOPE
The table space or index is stopped because of an invalid log RBA or LRSN in one of its pages.
STOPP
A stop is pending for the table space, table space partition, index, index partition, or logical index partition.
UT
The table space, table space partition, index, index partition, or logical index partition has been started for the execution of utilities only.
UTRO
The table space, table space partition, index, index partition, or logical index partition has been started for RW processing, but only RO processing is enabled because a utility is in progress for that object.
UTRW
The table space, table space partition, index, index partition, or logical index partition has been started for RW processing, and a utility is in progress for that object.
UTUT
The table space, table space partition, index, index partition, or logical index partition has been started for RW processing, but only UT processing is enabled because a utility is in progress for that object.
WEPR
Write error page range information.

Of course, there are many additional options that can be used in conjunction with the DISPLAY DATABASE command. The following options can be used to narrow down the amount of information displayed:

  • USE displays what processes are using resources for the page sets in the database 
  • CLAIMERS displays the claims on the page sets in the database
  • LOCKS displays the locks held on the page sets in the database
  • LPL displays the logical page list entries
  • WEPR displays the write error page range information.
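
For example, to check the locks being held on the page sets in a particular database, you might issue something along these lines (DBNAME is just a placeholder for your database name):

-DISPLAY DATABASE(DBNAME) SPACENAM(*) LOCKS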


Additionally, for partitioned page sets, you can specify which partition, or range of partitions, that you wish to display.
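
For example, to look at just a range of partitions for a single table space, you could issue something like this (the object names are placeholders, and PART accepts either a single partition number or a range):

-DISPLAY DATABASE(DBNAME) SPACENAM(TSNAME) PART(1:4)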

The OVERVIEW option can be specified to display each object in the database on its own line. This condenses the output of the command and makes it easier to view. The OVERVIEW keyword cannot be specified with any other keywords except SPACENAM, LIMIT, and AFTER.
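
For example (again, DBNAME is a placeholder):

-DISPLAY DATABASE(DBNAME) SPACENAM(*) LIMIT(*) OVERVIEW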

Another tactic that can be used to control the amount of output generated by DISPLAY DATABASE is to use the LIMIT parameter. The default number of lines returned by the DISPLAY command is 50, but the LIMIT parameter can be used to set the maximum number of lines returned to any numeric value. For example:

-DISPLAY DATABASE(DBNAME) LIMIT(300)

Using the LIMIT parameter in this manner would increase the limit to 300 lines of output. To indicate no limit, you can replace the numeric limit with an asterisk (*).

Finally, you can choose to display only objects in restricted or advisory status using either the ADVISORY or RESTRICT keyword.
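
For example, a quick way to check for any objects in a restricted state across the subsystem is a command along these lines (the wildcards simply request every database and space):

-DISPLAY DATABASE(*) SPACENAM(*) RESTRICT LIMIT(*)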



And be sure to tune in for the next blog post, which will look at using DISPLAY to uncover information about your DB2 buffer pools.

Wednesday, September 11, 2013

Answering a Question: Dealing with Different Data Types

I get a lot of e-mail. Some of it is junk, but some of it contains questions on all sorts of issues. I cannot answer everything that comes into my in box or I wouldn't have time to earn a living. But every now and then I will answer a question here on the blog.

Today's question is:


Dear Mr. Mullins, 
I have a question I hope you can help me with.  In DB2 I am attempting to compare two columns, however one column is data type "Integer" and the other is data type "Decimal".  Can you give me an idea on how to convert the data types so they can be compared?




And here is my answer:
You can cast one data type to the other. For example, to cast the decimal to an integer you would use the INTEGER function, like so:

           WHERE INTEGER(dec_col) = int_col

Or you could cast the integer column to a decimal like this:

           WHERE DECIMAL(int_col) = dec_col
Alternatively, you can use CAST to cast either column to either data type. For example, CAST(dec_col AS INTEGER)…  Hope this helps…
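
To put that in context, here is a rough sketch of how the CAST approach might look inside a query (the table and column names are invented for illustration):

           SELECT T1.int_col, T2.dec_col
           FROM   TAB1 AS T1,
                  TAB2 AS T2
           WHERE  CAST(T2.dec_col AS INTEGER) = T1.int_col

One caveat worth noting: casting the decimal column to an integer truncates any fractional digits, so if the decimal values can contain non-zero decimal places, casting the integer column to a decimal (as in the second example above) is usually the safer direction.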


Friday, September 06, 2013

Top Ten Most Pervasive Myths About DB2 for z/OS

Today's blog offers up yet another Top Ten list for DB2 users, perusers, and abusers... This time counting down the most common myths that are perpetuated "out there" regarding DB2 and how it works (or doesn't work)...


1. Use Views to Insulate Programs from Change

              This lie has been told for almost as long as DB2 has been around. I first wrote about this way back in 1991 for Database Programming & Design. Check that article out here if you don't understand why this is a bad idea, in general. 

2. Locking Problems Indicate a Database Problem

              Locking problems are generally caused by bad program design. You should write code to reduce the duration of locks and to COMMIT regularly... and then locking won't be a problem, for the most part.

3. Primary Key is Usually a Good Choice for Clustering

              Actually, the foreign key is likely to be a better choice. When you join PK-->FK, one PK row matches multiple FK rows. Wouldn't it be best if those FK rows were clustered on the same page (or pages)? A quick sketch follows.
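
              For example, if ORDER_ITEM rows are usually retrieved by joining to ORDERS on ORDER_NO, a clustering index on that foreign key (the object names here are made up purely for illustration) keeps the child rows for each order physically close together:

                   CREATE INDEX XORDITEM1
                       ON ORDER_ITEM (ORDER_NO)
                       CLUSTER;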

4. Just Using the Defaults Should Work Out Well

              Don't rely on defaults. Many of them are outdated or wrong... and even if they aren't, it is better to review and investigate all of the options before explicitly specifying the parameter values you want.

5. Programmers Don’t Need to Know How to Tune SQL

              Programming performance-oriented SQL into your programs from the beginning would go a long way toward improving performance overall... and reducing the length of the application development lifecycle. 

6. Black Boxes Work Well for Performance

              No they don't!!!

7. Using NULLs Can Save Space

              No they can't!!!

8. RUNSTATS Aren’t That Important

              If you don't work with up-to-date RUNSTATS then you are hobbling the DB2 Optimizer and almost assuredly getting sub-optimal access paths for your DB2 SQL. 


9. DB2 is a Hog

              If you don't use it properly, every piece of software can become a resource hog. If you take the time to learn how to work properly with DB2, then it will hum along like a well-oiled machine!

10. It Depends!

              This is the answer that can be used for every DB2 question. But if that answer is not followed up with what "it" depends upon, then it is a useless answer... and whoever gave you that answer is probably just trying to get rid of you instead of helping you with your problems.

Sunday, September 01, 2013

Top Ten Common SQL Mistakes (with DB2 for z/OS)

There are many different types of development mistakes that can be made when you are coding a complex application system. But probably the most pervasive mistakes being made when developing an application using DB2 for z/OS are SQL mistakes... well, actually, that statement is probably true regardless of the DBMS, but I am going to focus on DB2 for z/OS given that this is a blog that focuses on DB2 and mainframe issues.

  1. Assuming an ORDER BY is not necessary for ordered results - if you want your results always to be returned in a specific order, you must include the ORDER BY clause. If you fail to do so, a future rebind can change the access path and possibly also change the order of your result set.
  2. Forgetting the NULL indicator - when your query can return a NULL, be sure to include a null indicator or you will get a -305 SQLCODE (22002 SQLSTATE). And be sure to check the null indicator to see if the data is null or not!
  3. Incorrect expectations when using the NOT IN predicate with NULLs - NULLs can be confusing when used with the IN predicate. For example, what if we want to find all of the colors that are not assigned to a particular product using a query like the one shown below? The problem arises when P.color can be NULL: the NULL causes the predicate to evaluate to UNKNOWN, so the result set is always empty. (One way to sidestep this trap is shown after this list.)

           SELECT C.color
           FROM   Colors AS C
           WHERE  C.color NOT IN (SELECT P.color
                                  FROM   Products AS P);

  4. Not coding predicates appropriately in Outer Joins - Sometimes it can be difficult to figure out how to code outer joins appropriately if you don't code them very often. Terry Purcell has written a nice article on the topic that can be found here.
  5. Not coding a cursor for a multi-row result - When more than one row can be returned by your query, you must use a cursor (or specify FETCH FIRST 1 ROW ONLY).
  6. Recompiling but not binding - if you make changes to the SQL, you have to BIND it again. Otherwise it won't work!
  7. Forgetting to use single quotes around strings (instead of double quotes) - DB2 SQL expects single quotes around character strings.
  8. Trying to modify a Primary Key column - you cannot modify a primary key. A primary key should be unique within the table and immutable.
  9. Forcing dynamic SQL into static SQL (sometimes hundreds of static SQL statements) - you should analyze the type and nature of your database queries to determine whether they should be static or dynamic. Consider using the advice here (static SQL v. dynamic SQL) to guide you.
  10. Asking for more data than you need (columns and/or rows) - specify only the columns that you actually need in your SELECT-list... and use WHERE clauses (predicates) to filter the data down to just what you need before bringing it into the program. The less data that DB2 has to transfer from the database to your program, the more efficient things will be!
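
Going back to the NOT IN pitfall in item 3, one common way to sidestep it is to rewrite the predicate using NOT EXISTS (or to explicitly exclude the NULLs in the subquery). Here is a rough sketch using the same made-up tables:

           SELECT C.color
           FROM   Colors AS C
           WHERE  NOT EXISTS (SELECT 1
                              FROM   Products AS P
                              WHERE  P.color = C.color);

Because the correlated predicate P.color = C.color is simply not satisfied when P.color is NULL, the NOT EXISTS version returns the colors you would expect instead of an empty result set.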

Thursday, August 22, 2013

NoSQL Gets Me Thinking About ACID

This week I attended the NoSQL Now Conference in San Jose, California. This conference focused on NoSQL technology and implementations, and a LOT of the discussion focused on transactions and whether or not Big Data and NoSQL databases are at a disadvantage because, for the most part, they lack support for ACID.

You can read all about my experience at this conference in a couple of posts on my Data Technology Today blog.


At any rate, though, I got to thinking... and those of you who know me understand that that can be a dangerous thing. Basically, ACID is not a topic that relational folks sit around talking about. It is kind of taken for granted. So I thought it might be a good idea to reinforce the definition of ACID and why it is so important in DB2... and the relational world in general.

ACID is an acronym for atomicity, consistency, isolation, and durability. Each of these four qualities is necessary for a transaction to be designed correctly and deliver data integrity when complete:
  • Atomicity means that a transaction must exhibit “all or nothing” behavior. Either all of the instructions within the transaction happen, or none of them happen. Atomicity preserves the “completeness” of the business process.
  • Consistency refers to the state of the data both before and after the transaction is executed. A transaction maintains the consistency of the state of the data. In other words, after running a transaction, all data in the database is “correct.”
  • Isolation means that transactions can run at the same time. Any transactions running in parallel have the illusion that there is no concurrency. In other words, it appears that the system is running only a single transaction at a time. No other concurrent transaction has visibility to the uncommitted database modifications made by any other transactions. To achieve isolation, a locking mechanism is required.
  • Durability refers to the impact of an outage or failure on a running transaction. If the transaction ends abnormally before committing, it will not impact the state of the data; and once the transaction commits, its changes will survive any subsequent failures.

Let’s use an example to better understand the importance of ACID transactions to relational database applications. Consider a banking application. Assume that you wish to withdraw $50 from your account with your bank. This business process requires a transaction to be executed. You request the money either in person by handing a slip to a bank teller or by using an ATM. When the bank receives the request, it performs the following tasks, which make up the complete business process. The bank will

  1. Check your account to make sure you have the necessary funds to withdraw the requested amount.
  2. If you do not, deny the request and stop; otherwise continue processing.
  3. Debit the requested amount from your checking account.
  4. Produce a receipt for the transaction.
  5. Deliver the requested amount and the receipt to you.

The transaction that is run to perform the withdrawal must complete all of these steps, or none of these steps, or else one of the parties in the transaction will be dissatisfied. If the bank debits your account but does not give you your money, then you will not be satisfied. If the bank gives you the money but does not debit the account, the bank will be unhappy. Only the completion of every one of these steps results in a “complete business process.” Database developers must understand the requisite business processes and design transactions that ensure ACID properties.
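
To make this a bit more concrete, here is a minimal sketch of what the debit step might look like in SQL (the ACCOUNTS table, its columns, and the host variable are invented for illustration). The key point is that this update, the receipt, and the cash dispensing must all be committed together or all be rolled back together:

           UPDATE ACCOUNTS
              SET BALANCE = BALANCE - 50.00
            WHERE ACCT_NO = :HV-ACCT-NO
              AND BALANCE >= 50.00;

If the update succeeds and every subsequent step completes, the program issues a COMMIT; if anything fails along the way, a ROLLBACK undoes the debit, preserving the atomicity of the transaction.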

Unit of work (UOW) is another transaction term that describes a physical transaction. A UOW is a series of instructions and messages that, when executed, guarantees data integrity. So a UOW and a transaction are similar in concept. However, a UOW is not necessarily a complete business process—it can be a subset of the business process, and a group of units of work can constitute a single transaction. Each UOW must possess ACID characteristics. In other words, if the transaction were to fail, the state of the data upon failure must be consistent in terms of the business requirements.

To summarize, a transaction—when executed alone, on a consistent database—will either complete, producing correct results, or terminate, with no effect. In either case the resulting condition of the database will be a consistent state.

Hopefully after this discussion it is simple to see why relational databases—with many related tables—rely on ACID properties of transactions to maintain consistency. Of course, the NoSQL world has different use cases and, arguably, can get by with eventual consistency... that is, without ACID. But that is another topic that is, frankly, beyond the scope of this DB2/mainframe-focused blog. 

Hopefully, though, this review of ACID and its importance to data consistency was helpful.