Showing posts with label best practices. Show all posts
Showing posts with label best practices. Show all posts

Tuesday, June 10, 2014

ORDER BY an Expression

Sometimes a program requires that the results of a query be returned in a specific sequence. We all know that the ORDER BY clause can be used to sort SQL results into a specific order. For example, to return a sorted list of employee compensation sorted by last name we could write:

SELECT LASTNAME, 
       FIRSTNAME, 
       SALARY+COMM+BONUS
FROM   EMP
ORDER BY LASTNAME;

But what if we need to sort it by total compensation? There are two approaches that work here: position number and column renaming. Using position number the ORDER BY clause becomes:

SELECT LASTNAME, 
       FIRSTNAME, 
       SALARY+COMM+BONUS
FROM   EMP
ORDER BY 3;

This will cause DB2 to sort by the third element in the SELECT-list, in this case the total compensation expression. But what if we add another column at the beginning of the SELECT-list? Or what if we need to port the SQL to a different database with different standards? Well, in that case we can use column renaming:

SELECT LASTNAME, 
       FIRSTNAME, 
       SALARY+COMM+BONUS AS TOTAL_COMP
FROM   EMP
ORDER BY TOTAL_COMP;

This method is preferred for a number of reasons:

  • it will continue to work even if the SQL statement is changed
  • it gives the expression a name making it more self-documenting
  • it should be more portable

Monday, June 02, 2014

Don't Neglect Your DB2 Rebind Strategy

We’re all busy. Frequently it can seem like you just got in to the office and already it is past quitting time! There is so much to do and so little time to do it all. And we all work more than 40 hours a week… these are some of the common complaints of the busy DBA.

And those are valid concerns, but it does not diminish the need to properly address DB2 database administration and performance management... with a special focus on proactive management. 

So please take a little bit of time to read about, and consider your organization's strategy for rebinding DB2 applications.

REBIND Strategy

One of the most important contributors to the on-going efficiency and health of your DB2 environment is proper management of DB2 access path changes. A thorough REBIND management process is a requirement for healthy DB2 applications.

But many shops do not do everything possible to keep access paths up-to-date with the current state of their data. Approaches vary, such as rebinding only when a new version of DB2 is installed, whenever PTFs are applied to DB2, or to rebind automatically after a regular period of time. Although these methods are workable, they are less than optimal. 

The worst approach though is the “if it ain’t broke don’t fix it” mentality. In other words, many DBA groups adopt “never REBIND unless you absolutely have to” as a firm policy. The biggest problem this creates is that it penalizes every program in your subsystem for fear of a few degraded access paths. This results in potentially many programs having sub-optimal performance because the optimizer never gets a chance to create better access paths as the data and environment change. Of course, the possibility of degraded performance after a REBIND is real – and that is why many sites avoid regularly rebinding their programs.

Even so, the best approach is to perform regular REBINDs as your data changes. To do so, you should follow the Three R’s. Regularly reorganizing to ensure optimal structure; followed by RUNSTATS to ensure that the reorganized state of the data is reflected in the DB2 Catalog; and finally, rebinding all of programs that access the reorganized structures. This technique can improve application performance because access paths will be better designed based on an accurate view of your data.

Of course, adopting the Three R’s approach raises questions, such as “When should you reorganize?” To properly determine when to reorganize you’ll have to examine statistics. This means looking at either RUNSTATS in the catalog or Real Time Statistics (RTS). So, the Three R’s become the Four 4 R’s – examine the Real Time Stats, REORG database objects as indicated by RTS, RUNSTATS to get the new statistics, then REBIND any impacted application programs.

Some organizations do not rely on statistics to schedule REORGs. Instead, they build REORG  JCL as they create each object – that is, create a table space, build and schedule a REORG job, and run it monthly or quarterly. This is better than no REORG at all, but it is not ideal because you are likely to be reorganizing too soon (wasting CPU cycles) or too late (causing performance degradation until REORG).

It is better to base your REORGs off of thresholds on catalog or real-time statistics. Statistics are the fuel that makes the optimizer function properly. Without accurate statistics the optimizer cannot formulate the best access path to retrieve your data because it does not know how your data is currently structured. So when should you run RUNSTATS? One answer is “as frequently as possible based on how often your data changes.” To succeed you need an understanding of data growth patterns – and these patterns will differ for every table space and index.

The looming question is this: why are we running all of these RUNSTATS and REORGs? To improve performance, right? But only with regular REBINDs will your programs take advantage of the new statistics to build more efficient access paths, at least for static SQL applications.

Without an automated method of comparing and contrasting access paths, DB2 program change management can be time-consuming and error-prone – especially when we deal with thousands of programs. And we always have to be alert for a rogue access path – that is, when the optimizer formulates a new access path that performs worse than the previous access path.

Regular rebinding means that you must regularly review access paths and correct any “potential” problems. Indeed, the Four R’s become the Five R’s because we need to review the access paths after rebinding to make sure that there are no problems. So, we should begin with RTS (or RUNSTATS) to determine when to REORG. After reorganizing we should run RUNSTATS again, followed by a REBIND. Then we need that fifth R – which is to Review the access paths generated by the REBIND.

The review process involves finding which statements might perform worse than before. Ideally, the DBAs would review all access path changes to determine if they are better or worse. But DB2 does not provide any systematic means of doing that. There are tools that can help you achieve this though.

The bottom line is that DB2 shops should implement best practices whereby access paths are tested to compare the before and after impact of the optimizer’s choices. By adopting best practices to periodically REBIND your DB2 programs you can achieve better overall application performance because programs will be using access paths generated from statistics that more accurately represent the data. And by implementing a quality review step there should be less need to reactively tune application performance because there will be fewer access path and SQL-related performance problems.

Friday, July 26, 2013

Top Ten Most Common DB2 Performance Problems


  1. PEBCAK 

    The number one cause of DB2 performance problems today, as always, is Problem Exists Between Chair And Keyboard!
     
  2. Poorly coded SQL

    Many performance problems can be traced back to inappropriately coded SQL Code it correctly from the beginning and tune what is already out there.
         
  3. Improper indexing

    Optimize performance via indexing by workload, not by object.  
     
  4. Bad program design

    Coding DB2 SQL for Performance in your application programs from the outset can eliminate many future problems.

    .

  5. Bachelor programming syndrome

    Yes, the dreaded "Fear of Committing" can cause performance problems due to concurrency issues.
         
  6. Improperly defined buffer pools

    Defining effective buffer pools for your DB2 workload is important. There are many things you can do to identify the proper settings and sizing of your DB2 buffer pools.
         
  7. Index / table space needs to be reorganized

    Reorganization of indexes and table spaces can improve performance. Follow the Five R's to assure optimal DB2 application performance.
     
  8. Improperly designed database structures

    Designing database structures correctly -- from the beginning -- is the way to go for efficiency and efficacy.
     
  9. Copied code syndrome

     
  10. RUNSTATS not up-to-date (or not even run)

    How can you expect for the DB2 Optimizer to do its thing on your SQL without statistics about your data and environment? Again, The Five R's!

And if a Top Ten list does not provide enough detail for you, then splurge for a copy of my book: DB2 Developer's Guide, 6th edition. Recently updated for DB2 10 for z/OS, the book delivers over 1600 pages full of DB2 tips, tricks, guidelines, and details...  It rates 5 stars on Amazon!

Thursday, August 05, 2010

DB2 Best Practices

With today's blog entry I'm hoping to encourage some open-ended dialogue on best practices for DB2 database administration. Give the following questions some thought and if you've got something to share, post a comment!

What are the things that you do, or want to do, on a daily basis to manage your database infrastructure?

What things have you found to be most helpful to automate in administering your databases? Yes, I know that all the DBMS vendors are saying that they've created the "on demand" "lights-out" "24/7" database environment, but we all know that ain't so! So what have you done to automate (either using DBMS features, tools, or homegrown scripts) to keep an eye on things?

How have you ensured the recovery of your databases in the case of problems? Application problems? Against improper data entry or bad transactions? Disaster situations? And have you tested your disaster recovery plans? If so, how? And were they successful?

What type of auditing is done on your databases to track who has done what to what data? Do you audit all changes? To all applications, or just certain ones? Do you audit access, as well as modification? If so how?

How do you manage change? Do you use a change management tool or do it all by hand? Are database schema changes integrated with application changes? If so, how? If not, how do you coordinate things to keep the application synchronized with the databases?

What about DB2 storage management? Do you actively monitor disk usage of your DB2 table space and index spaces? Do you have alerts set so that you are notified if any object is nearing its maximum size? How about your VSAM data sets? Do you monitor extents and periodically consolidate? How do you do it... ALTER/REORG? Defrag utilities? Proactive defrag?

Is your performance management set up with triggers and farmed out to DBAs by exception or is it all reactive, with tuning tasks being done based on who complains the loudest?

Do you EXPLAIN every SQL statement before it goes into production? Does someone review the acess plans or are they just there to be reviewed in case of production performance problems? Do you rebind your programs periodically (for static SQL programs) as your data volume and statistics change, or do you just leave things alone until (or unless) someone complains?

When do you reorganize your data structures? On a pre-scheduled regular basis or based on database statistics? Or a combination of both? And how do you determine which are done using which method? What about your other DB2 utilities? Have you automated their scheduling or do you still manually build JCL?

How do you handle PTFs? Do you know which have been applied and which have not? And what impact that may be having on your database environment and applications? Do you have a standard for how often PTFs are applied?

How is security managed? Do the DBAs do all of the GRANTs and REVOKEs or is that job shared by security administrators? Are database logons coordinated across different DBMSs? Or could I have an operating system userid that is different from my SQL Server logon that is different than my Oracle logon -- with no capability of identifying that the user is the same user across the platforms?

How has regulatory compliance (e.g. PCI DSS, SOX, etc.) impacted your database administration activities? Have you had to purchase additional software to ensure compliance? How is compliance policed at your organization?

Just curious... Hope I get some responses!