The Db2 Portal Blog: RUNSTATS


Thursday, November 09, 2023

Top Ten Db2 Performance Tips – No. 9 Regular Maintenance and Housekeeping

When thinking about the performance characteristics of your Db2 databases and applications, keep in mind the routine maintenance that can help or hinder performance. Regular maintenance and housekeeping tasks are essential for ensuring the optimal performance and stability of your Db2 environment.

By performing routine maintenance activities, database administrators can proactively address performance degradation, optimize query execution, and maintain a healthy database environment. In today’s post, we will highlight the importance of regular maintenance tasks and discuss key activities that contribute to database performance.

Index Maintenance

Indexes play a crucial role in query performance, as they facilitate quick data retrieval. Over time, indexes can become fragmented, leading to increased disk I/O and decreased query performance. Regularly reorganizing indexes helps eliminate fragmentation and enhances query execution efficiency. By scheduling index reorganization tasks based on the fragmentation level and database activity, administrators can maintain optimal index performance and minimize the impact of fragmentation on query response times.

You should also monitor index usage and consider removing any unused indexes. You can identify unused indexes relatively easily using the LASTUSED column of the real-time statistics (RTS) table SYSIBM.SYSINDEXSPACESTATS in the Db2 Catalog.
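As a rough illustration of the LASTUSED check, the following Python sketch filters index rows by how long ago they were last used. The sample rows, the 365-day cutoff, and the convention that a missing date is passed as None are all assumptions made for the example; in practice the rows would come from a query against SYSIBM.SYSINDEXSPACESTATS. Because a partitioned index has one row per partition, the sketch keeps the most recent date per index.

```python
from datetime import date, timedelta

# Hypothetical rows shaped like (CREATOR, NAME, PARTITION, LASTUSED), as they
# might be fetched from SYSIBM.SYSINDEXSPACESTATS. Passing None when no use
# has been recorded is an assumption about how the caller maps the data.
SAMPLE_ROWS = [
    ("APP1", "XCUST01", 1, date(2023, 11, 1)),
    ("APP1", "XCUST01", 2, date(2022, 1, 15)),
    ("APP1", "XORD02", 0, date(2021, 6, 30)),
    ("APP2", "XITEM03", 0, None),
]


def unused_indexes(rows, today, max_idle_days=365):
    """Return sorted (creator, name) pairs not used within max_idle_days."""
    cutoff = today - timedelta(days=max_idle_days)
    latest = {}
    for creator, name, _partition, last_used in rows:
        key = (creator, name)
        previous = latest.get(key)
        # Keep the most recent LASTUSED across all partitions of an index.
        if last_used is not None and (previous is None or last_used > previous):
            latest[key] = last_used
        else:
            latest.setdefault(key, previous)
    return sorted(k for k, used in latest.items() if used is None or used < cutoff)


candidates = unused_indexes(SAMPLE_ROWS, today=date(2023, 11, 9))
print(candidates)
```

An index that shows up here is only a candidate for removal; indexes that enforce uniqueness or support infrequent (say, year-end) processing may still be needed.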

Tablespace Reorganization

As your database grows and data is modified or deleted, storage space can become fragmented, leading to suboptimal performance. Regularly reorganizing database structures helps to consolidate data, reclaim unused space, and optimize storage allocation. Reorganizing your tablespaces can improve I/O performance, reduce disk fragmentation, and enhance overall system efficiency. By performing regular database reorganizations based on data growth patterns and workload characteristics, administrators can maintain a well-organized and performant database environment.

Up-to-Date Statistics

Accurate and up-to-date statistics are essential for the Db2 optimizer to make informed decisions on query execution plans. As data in the database changes, statistics need to be updated to reflect the current distribution of data. Regularly updating statistics (using the RUNSTATS utility) ensures that the optimizer has the most accurate information to generate optimal execution plans. By analyzing data distribution patterns and scheduling statistics updates accordingly, administrators can improve query performance and avoid suboptimal query plans caused by outdated statistics.

For packages using static SQL, taking advantage of updated statistics requires rebinding. However, you may not want to rebind every time you run RUNSTATS unless application performance is suffering.

Routine Backups

Regular backups are vital for data protection and disaster recovery. Performing routine database backups not only safeguards the integrity of the database but also contributes to performance optimization. In the event of a failure or data loss, having a recent backup minimizes the recovery time and ensures business continuity. DBAs should establish a backup schedule based on the criticality of the data, recovery time objectives (RTOs), and workload requirements.

And do not forget to regularly test your recovery plans and capabilities. Too often DBAs focus on backups at the expense of recovery… and backups are needed primarily to enable recovery, right?

Transaction Log Management

And let’s not forget the transaction logs! Logs play a critical role in ensuring data consistency and recoverability. Regularly monitoring and managing the transaction log space helps prevent log-related performance issues and ensures uninterrupted database operations. Activities such as transaction log backups, log file sizing, optimizing log offloading, and log file utilization monitoring are crucial for maintaining optimal transaction log performance and managing log space efficiently.

Buffer Pool Review

Regularly monitoring the efficiency of your Db2 buffer pools is important to ensure that you are achieving expected hit ratios and performance. As new applications are added, your amount of data increases, and access patterns change, it will be necessary to adjust buffer pool sizes and parameters to optimize performance.
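For reference, one commonly used way to compute a buffer pool hit ratio is sketched below in Python. The function and parameter names are hypothetical; the counters themselves (getpage requests, synchronous reads, and pages read by prefetch) would have to be taken from your monitor or from buffer pool display output.

```python
def buffer_pool_hit_ratio(getpages, sync_reads, prefetch_pages_read):
    """Fraction of getpage requests satisfied without reading pages from disk."""
    if getpages <= 0:
        raise ValueError("getpages must be positive")
    pages_read = sync_reads + prefetch_pages_read
    return (getpages - pages_read) / getpages


# Made-up counter values for one measurement interval.
ratio = buffer_pool_hit_ratio(
    getpages=1_000_000, sync_reads=40_000, prefetch_pages_read=110_000
)
print(f"{ratio:.1%}")
```

Note that this ratio can go negative when prefetch brings in more pages than the application ever requests, which is itself a signal worth investigating.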

System Maintenance

Be sure to keep your Db2 subsystem updated with recent maintenance. You can use the -DIS GROUP command, even if you are not running data sharing, to display the current status of your Db2 software.

This command returns the message DSN7100I which is documented at https://www.ibm.com/docs/en/db2-for-zos/13?topic=messages-dsn7100i. It will show you the current version and function level, the current code level, and also the highest possible function level you can activate for your Db2 environment.

Also be sure to follow the IBM-recommended preventive maintenance strategy and apply the appropriate maintenance using the IBM-supplied RSU (Recommended Service Upgrade).

Summary

By incorporating these regular maintenance tasks into your database administration and management routine, DBAs and systems programmers can optimize performance, prevent performance degradation, and mitigate potential issues. Automation tools and scripts can streamline these maintenance activities and ensure consistency and timeliness in execution.

Thursday, December 17, 2020

Db2 Utilities and Modern Data Management

Db2 utilities are the unappreciated, and often overlooked, workhorses of your mainframe Db2 environment. They perform the dirty work that has to be done to populate, organize, back up, and recover your vital mainframe data. Without them, building effective Db2 databases, managing data, optimizing performance, and even accessing mainframe data would be a lot more difficult than it currently is.

The Situation

Think about the Db2 utility situation at your shop. If you are like most organizations you will have Db2 utilities regularly running all the time. There are load and unload tasks running to refresh data for development and testing, for moving data between environments for analysis and processing, and for various other purposes. The LOAD and UNLOAD utilities bear a lot of the hard work for data movement.

You are also most likely reorganizing data using a REORG utility for most of your Db2 table spaces and probably indexes, too. In many cases reorganization jobs are scheduled to run on a regular basis: weekly, monthly, quarterly, etc. Frequently you just set these jobs up when the object is created. The job gets scheduled and is just run without anybody taking a look at it unless, or until, there are performance problems.

Then there are COPY and RECOVER utilities for backing up and recovering data when there are problems. The image copy backup jobs are running all the time, taking either full or incremental copies to ensure that you can recover data in case problems are encountered. The copies are running all the time, but the recover jobs (hopefully) are not running all the time!

You are also going to be running the RUNSTATS utility to gather statistics for Db2 to use for query optimization. Depending on how often your data changes, you may be running RUNSTATS frequently or infrequently. Many times the same fate as REORG befalls RUNSTATS… that is, it is scheduled and forgotten about unless problems arise.

There are other utilities, like CHECK which is used to verify the integrity of data. You are probably not running this one very often but when you need it you want it to run fast, right?

So, all of these utilities are “out there” running and consuming CPU to move, copy, and manage your Db2 data. But are they being run as effectively as possible?

Moving to the Modern Db2 Utility Way

I think by this point everybody will agree that utility type processing is not just critical, but mandatory for a Db2 environment. But just running with the bare basics is not the best approach.

If we think about data movement with unload and load processing there are several things that you might want to consider for improvement. First of all, consider the speed and performance of the unload and load tasks. You probably want these jobs to run as fast as possible – that is, to consume as little elapsed time as possible to complete. After all, you are probably using these utilities to build environments or even refresh portions of an environment… and there will be developers and testers waiting to use that data as soon as it is available. Using the fastest utility programs available will minimize the wait time and make your developers and testers more productive. Furthermore, you want these tasks to consume as little CPU as possible to reduce your monthly mainframe bills!

In some cases you might want to re-consider unloading and loading altogether, using alternate utilities and offerings that can clone an entire subsystem or move data outside the control of Db2 at the data set level.

If we think about reorganization, it is likely that you are running REORG tasks that don’t need to be run, at least not as regularly as they are being run. At the same time, it is also likely that you are not running other REORG tasks as frequently as you should, thereby degrading every other task that accesses the data. Fortunately, you can use RTS (real time statistics) to help guide when you should (and should not) reorganize your data. In the best case, the utility itself relies on RTS to figure out whether it needs to run and runs only when it makes sense. Failing this, you are again likely consuming more CPU than is necessary (either running unneeded REORGs or accessing poorly organized data, as the case may be).
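To give a feel for what RTS-guided reorganization can look like, here is a minimal Python sketch. The field names are modeled on counters in SYSIBM.SYSTABLESPACESTATS (TOTALROWS, REORGINSERTS, REORGDELETES, REORGUNCLUSTINS), but the percentage thresholds are hypothetical placeholders, not recommendations, and a real policy would look at more than these four values.

```python
from dataclasses import dataclass


@dataclass
class TableSpaceStats:
    """A few counters modeled on SYSIBM.SYSTABLESPACESTATS columns."""
    total_rows: int
    reorg_inserts: int      # inserts since the last REORG
    reorg_deletes: int      # deletes since the last REORG
    reorg_unclust_ins: int  # inserts placed out of clustering order


def reorg_recommended(stats, change_pct=25.0, unclustered_pct=10.0):
    """Return (decision, reasons) using hypothetical percentage thresholds."""
    if stats.total_rows == 0:
        return False, []
    reasons = []
    changed = 100.0 * (stats.reorg_inserts + stats.reorg_deletes) / stats.total_rows
    unclustered = 100.0 * stats.reorg_unclust_ins / stats.total_rows
    if changed > change_pct:
        reasons.append(f"insert/delete activity {changed:.1f}% exceeds {change_pct}%")
    if unclustered > unclustered_pct:
        reasons.append(f"unclustered inserts {unclustered:.1f}% exceed {unclustered_pct}%")
    return bool(reasons), reasons


quiet = TableSpaceStats(total_rows=1000, reorg_inserts=10, reorg_deletes=5, reorg_unclust_ins=2)
busy = TableSpaceStats(total_rows=1000, reorg_inserts=300, reorg_deletes=100, reorg_unclust_ins=150)
print(reorg_recommended(quiet))
print(reorg_recommended(busy))
```

A scheduled job could call a check like this first and skip the REORG step when nothing is flagged, instead of reorganizing on the calendar alone.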

If you think about your backup and recovery situation, the issue is likely complexity. Sure you want COPY and RECOVER utilities that run fast and consume minimal CPU, but the big issue is analysis. By that I mean, when you need to recover you want to make sure that you can use the image copies (and, of course, the log) to recover and meet your RTOs (recovery time objectives). But creating recover jobs on-the-fly, in a probably complicated environment with inter-related tables and data, can be difficult. And doing so when there is an outage, which is usually the case, exacerbates the situation. Using intelligent utilities to create the right image copies and to automatically build an appropriate recovery strategy when needed should be the modern approach.

And not to neglect RUNSTATS and CHECK, you want both of those utilities to run as fast as possible, consuming minimal CPU, too. And you want guidance on when and how to run them using available RTS, statistics, and any system information available.

What Can You Do?

One approach is to use modern utilities that are not only built for speed but also incorporate AI and machine learning to automate and improve the Db2 utility experience. BMC Software is once again in the vanguard with its BMC AMI utilities for Db2.

The first question you probably have is "What the heck is AMI?" Well, AMI, which stands for Automated Mainframe Intelligence, is technology that is being infused into BMC’s product line to leverage AI, machine learning, and predictive analytics to achieve a self-managing mainframe.

BMC AMI Utilities for Db2 are designed for modern complex Db2 environments. They use a centralized, intelligent architecture (see diagram below) designed specifically to handle the complexity facing IT today. Through intelligent policy-driven automation, you can use the AMI Utilities for Db2 to manage growing amounts of data with ease and, at the same time, deliver full application availability.

Figure 1. BMC AMI Utilities for Db2

If you are looking to reduce CPU and elapsed time by as much as 75%, eliminate downtime while delivering full application availability, lower disk usage, eliminate sort in your REORGs, and simplify complex utility operations, then it makes sense to take a look at the BMC AMI Utilities for Db2.

----------

You might also want to take a look at this blog post from BMC that discusses how to Save Time and Money with Updated Unload Times

And this analysis of the BMC next generation REORG technology from Ptak Associates

Thursday, June 25, 2020

Db2 12 for z/OS Function Level 507

This month, June 2020, IBM introduced a new function level, FL507, for Db2 12 for z/OS. This is the first new function level this year, and the first since October 2019. The Function Level process was designed to release Db2 functionality using Continuous Delivery (CD) in short, quick bursts. However, it seems that the global COVID-19 pandemic slowed things a bit… and that, of course, is understandable. But now we have some new Db2 for z/OS capabilities to talk about for the first time in a little while!

There are four significant impacts of this new function level:

Application granularity for locking limits
Deletion of old statistics from the Db2 Catalog when using profiles
CREATE OR REPLACE capability for stored procedures
Passthrough-only expressions with IBM Db2 Analytics Accelerator (IDAA)

Let’s take a quick look at each of these new things.

The first new capability is the addition of application granularity for locking limits. Up until now, the only way to control locking limits was with NUMLKUS and NUMLKTS subsystem parameters, and they applied to the entire subsystem.

NUMLKTS defines the threshold for the number of page locks that can be concurrently held for any single table space by any single Db2 application (thread). When the threshold is reached, Db2 escalates all page locks for objects defined as LOCKSIZE ANY according to the following rules:

All page locks held for data in segmented table spaces are escalated to table locks.
All page locks held for data in partitioned table spaces are escalated to table space locks.

NUMLKUS defines the threshold for the total number of page locks across all table spaces that can be concurrently held by a single Db2 application. When any given application attempts to acquire a lock that would cause the application to surpass the NUMLKUS threshold, the application receives a resource unavailable message (SQLCODE of -904).

Well, now we have two new built-in global variables to support application granularity for locking limits.

The first is SYSIBMADM.MAX_LOCKS_PER_TABLESPACE and it is similar to the NUMLKTS parameter. It can be set to an integer value for the maximum number of page, row, or LOB locks that the application can hold simultaneously in a table space. If the application exceeds the maximum number of locks in a single table space, lock escalation occurs.

The second is SYSIBMADM.MAX_LOCKS_PER_USER and it is similar to the NUMLKUS parameter. You can set it to an integer value that specifies the maximum number of page, row, or LOB locks that a single application can concurrently hold for all table spaces. The limit applies to all table spaces that are defined with the LOCKSIZE PAGE, LOCKSIZE ROW, or LOCKSIZE ANY options.
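The interplay of the two limits can be pictured with a small Python model. This is a deliberate simplification, not a description of Db2 internals: the class and method names are invented, lock types are not distinguished, escalation is modeled as trading the fine-grained locks on one table space for a single gross lock, and exceeding the per-application limit is assumed to behave like the NUMLKUS case described above (SQLCODE -904).

```python
class ResourceUnavailable(Exception):
    """Stands in for the SQLCODE -904 condition described above."""


class LockTracker:
    """Toy model of per-table-space and per-application lock limits."""

    def __init__(self, max_locks_per_tablespace, max_locks_per_user):
        self.max_per_ts = max_locks_per_tablespace
        self.max_per_user = max_locks_per_user
        self.page_locks = {}    # table space name -> fine-grained lock count
        self.escalated = set()  # table spaces now covered by one gross lock

    def acquire(self, tablespace):
        if tablespace in self.escalated:
            return "covered by escalated lock"
        # Per-application limit: counted across all table spaces.
        if sum(self.page_locks.values()) + 1 > self.max_per_user:
            raise ResourceUnavailable("-904: lock limit for the application exceeded")
        count = self.page_locks.get(tablespace, 0) + 1
        # Per-table-space limit: exceeding it triggers escalation.
        if count > self.max_per_ts:
            self.page_locks.pop(tablespace, None)
            self.escalated.add(tablespace)
            return "escalated"
        self.page_locks[tablespace] = count
        return "granted"


tracker = LockTracker(max_locks_per_tablespace=3, max_locks_per_user=5)
outcomes = [tracker.acquire("TS1") for _ in range(4)]
print(outcomes)
```

In this model the fourth lock request on TS1 escalates, after which further requests on TS1 no longer count against the per-application total.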

The next new capability is the deletion of old statistics when using profiles. When you specify the USE PROFILE option with RUNSTATS, Db2 collects only those statistics that are included in the specified profile. Once function level 507 is activated, Db2 will delete any existing statistics for the object(s) that are not part of the profile. This means that all frequency, key cardinality, and histogram statistics that are not included in the profile are deleted from the Db2 Catalog for the target object.

This is a welcome new behavior because it makes it easier to remove old and stale distribution statistics. Keep in mind that this new behavior also applies when you use profiles to gather inline statistics with the REORG TABLESPACE and LOAD utilities.
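The before-and-after behavior can be expressed as a simple set operation. The Python sketch below models only the three kinds of distribution statistics named above (frequency, key cardinality, and histogram); the function name, the tuple layout, and the sample column names are made up for illustration.

```python
def stats_after_profile_runstats(catalog_stats, profile_stats, fl507_active):
    """Return the distribution statistics left in the catalog after a
    RUNSTATS ... USE PROFILE run, in this simplified model."""
    collected = set(profile_stats)
    if fl507_active:
        # Anything not in the profile is deleted for the target object.
        return collected
    # Before FL507, statistics outside the profile stay behind and go stale.
    return set(catalog_stats) | collected


# (kind, columns) pairs already in the catalog for one hypothetical table.
before = {
    ("FREQVAL", "STATUS"),
    ("HISTOGRAM", "ORDER_DATE"),
    ("KEYCARD", "CUST_ID,ORDER_DATE"),
}
profile = {("FREQVAL", "STATUS")}

print(sorted(stats_after_profile_runstats(before, profile, fl507_active=False)))
print(sorted(stats_after_profile_runstats(before, profile, fl507_active=True)))
```

The practical consequence is that the profile becomes the single definition of which distribution statistics exist, so it is worth reviewing profiles before activating the function level.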

Another great new capability, one that stored procedure users have been awaiting for some time now, is the ability to specify CREATE OR REPLACE for procedures. This means that you do not have to first DROP a procedure if you want to modify it. You can simply specify CREATE OR REPLACE PROCEDURE and if it already exists, the procedure will be replaced, and if not, it will be created. This capability has been available in other DBMS products that support stored procedures for a while and it is good to see it come to Db2 for z/OS!

Additionally, for native SQL procedures, you can use the OR REPLACE clause on a CREATE PROCEDURE statement in combination with a VERSION clause to replace an existing version of the procedure, or to add a new version of the procedure. When you reuse a CREATE statement with the OR REPLACE clause to replace an existing version or to add a new version of a native SQL procedure, the result is similar to using an ALTER PROCEDURE statement with the REPLACE VERSION or ADD VERSION clause. If the OR REPLACE clause is specified on a CREATE statement and a procedure with the specified name does not yet exist, the clause is ignored and a new procedure is still created.
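The rules in the last two paragraphs can be summarized in a toy Python model of a procedure catalog. Everything here is hypothetical scaffolding (the class, the default version name, the error type); it only mirrors the create, replace-version, and add-version outcomes described above.

```python
class ProcedureCatalog:
    """Toy model of CREATE [OR REPLACE] PROCEDURE with versions."""

    def __init__(self):
        self.procedures = {}  # procedure name -> {version id: body}

    def create(self, name, body, version="V1", or_replace=False):
        versions = self.procedures.get(name)
        if versions is None:
            # OR REPLACE is ignored when the procedure does not exist yet.
            self.procedures[name] = {version: body}
            return "created"
        if not or_replace:
            raise ValueError(f"procedure {name} already exists")
        # Comparable to ALTER PROCEDURE ... REPLACE VERSION or ADD VERSION.
        action = "replaced version" if version in versions else "added version"
        versions[version] = body
        return action


catalog = ProcedureCatalog()
print(catalog.create("UPDATE_SALARY", "body 1", or_replace=True))
print(catalog.create("UPDATE_SALARY", "body 2", or_replace=True))
print(catalog.create("UPDATE_SALARY", "body 3", version="V2", or_replace=True))
```

Without OR REPLACE, the second call would fail in this model, which corresponds to the DROP-first workflow needed before FL507.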

And finally, we have support for passthrough-only expressions to IDAA. This is needed because you may want to use an expression that exists on IDAA, but not on Db2 12 for z/OS. With a passthrough-only expression, Db2 for z/OS simply verifies that the data types of the parameters are valid for the functions. The expressions get passed over to IDAA, and the accelerator engine does all other function resolution processing and validation.

What new expressions does FL507 support, you may ask? Well, all of the following built-in functions are now supported as passthrough-only expressions to IDAA:

ADD_DAYS
BTRIM
DAYS_BETWEEN
NEXT_MONTH
Regression functions

REGR_AVGX
REGR_AVGY
REGR_COUNT
REGR_INTERCEPT
REGR_ICPT
REGR_R2
REGR_SLOPE
REGR_SXX
REGR_SXY
REGR_SYY

ROUND_TIMESTAMP (when invoked with a DATE expression)

You can find more details on the regression functions from IBM here.

Summary

These new capabilities are all nice, new features that you should take a look at, especially if you have applications and use cases where they can help.

The enabling APAR for FL507 is PH24371. There are no incompatible changes with FL507. But be sure to read the instructions for activation details and Db2 Catalog impacts for FL507.

Wednesday, July 10, 2019

There’s a New Db2 12 for z/OS Function Level (505)

In late June 2019, IBM delivered more great new capabilities with the latest new function level for Db2 12 for z/OS, Function Level 505 (or FL505).

If you do not know what a function level is, then you probably aren’t yet on Version 12, because function levels are how new capabilities are being rolled out for Db2 12 and beyond. It is how IBM has enabled a continuous delivery model for Db2 functionality. You can learn more about function levels here.

Although the first link above goes into all of the gory details of the new functionality, I will take a bit of time to summarize the highlights of this new function level.

The first thing that will appeal to most Db2 users is improved performance. And FL505 delivers improved performance in two areas: HTAP and RUNSTATS.

For HTAP, FL505 improves the latency between Db2 and the IBM Analytics Accelerator (sometimes called IDAA). Nobody likes latency and these improvements can enable transactional and analytical applications to see the same data.
For RUNSTATS, FL505 makes the default RUNSTATS behavior to use page sampling for universal table spaces (unless the RUNSTATS specification explicitly states TABLESAMPLE SYSTEM with a value other than AUTO). This will boost RUNSTATS performance. (A nice description of this is provided by Peter Hartmann here.)

FL505 also delivers REBIND phase-in for executing packages. Waiting for a package to be idle (not running) has long been a deterrent to rebinding. Now, you can REBIND a package while it is running. Db2 makes this happen by creating a new copy of the package. When the REBIND completes, new executions of the package will use the newly rebound package and the threads already running with the old package continue to do so successfully until completion.
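Conceptually, phase-in works like the small Python model below: a rebind creates a new current copy without waiting, running threads stay on the copy they started with, and new executions pick up the new one. The class and its methods are invented for illustration and say nothing about how Db2 manages package copies internally.

```python
class Package:
    """Toy model of REBIND phase-in for a running package."""

    def __init__(self, name):
        self.name = name
        self.current_copy = 1
        self.running = {}  # thread id -> copy id that thread is executing

    def start(self, thread_id):
        # New executions always use whatever copy is current.
        self.running[thread_id] = self.current_copy
        return self.current_copy

    def finish(self, thread_id):
        del self.running[thread_id]

    def rebind(self):
        # No wait for running threads; a new copy simply becomes current.
        self.current_copy += 1
        return self.current_copy

    def copies_in_use(self):
        return sorted(set(self.running.values()))


pkg = Package("COLL1.PGM1")
pkg.start("thread-A")   # runs with copy 1
pkg.rebind()            # copy 2 becomes current while thread-A is active
pkg.start("thread-B")   # runs with copy 2
print(pkg.copies_in_use())
```

Once thread-A finishes, only the rebound copy remains in use, which is the point at which the old copy is no longer needed in this model.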

We also get some new built-in functions (BIFs) in FL505, for encrypting and decrypting data using key labels. You may be aware that Db2 already had functions for encryption and decryption, but these functions, introduced back in V9, were not very capable because they required you to provide and manage a password to decrypt the data. The new functions work with key labels: ENCRYPT_DATAKEY encrypts plain text to a block of encrypted text using a specified algorithm and key label, and DECRYPT_DATAKEY decrypts that block and returns the data as the specified data type.

And with FL505 we finally get additional functionality for the DECFLOAT data type. The DECFLOAT data type was introduced in DB2 9 for z/OS, but it is not widely used because of some shortcomings. But first, what is DECFLOAT? Well, DECFLOAT is basically a combination of the DECIMAL and floating-point data types, that is, a decimal floating-point data type. It is specified as DECFLOAT(n), where the value of n can be either 16 or 34, representing the number of significant digits that can be stored. A decimal floating-point value is an IEEE 754r number with a decimal point, and it can be useful for storing and managing very large numbers.

So what is the improvement? Quite simply, it is now possible to specify columns defined as DECFLOAT in an index and as a key in a primary key or a unique key. Unfortunately, there is still no support for DECFLOAT usage in COBOL programs, which will likely continue to hinder its uptake in many shops.
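To make the 16-digit versus 34-digit distinction concrete, here is a short Python sketch using the standard decimal module, which also implements decimal floating-point arithmetic. It only mimics the number of significant digits of DECFLOAT(16) and DECFLOAT(34); exponent ranges and Db2-specific rounding settings are not modeled.

```python
from decimal import Context, Decimal

# Contexts with the same number of significant digits as the two DECFLOAT sizes.
dec16 = Context(prec=16)
dec34 = Context(prec=34)

third16 = dec16.divide(Decimal(1), Decimal(3))
third34 = dec34.divide(Decimal(1), Decimal(3))
print(len(third16.as_tuple().digits), third16)
print(len(third34.as_tuple().digits), third34)

# Decimal floating point represents values such as 0.1 exactly,
# which binary floating point cannot do.
binary_exact = (0.1 + 0.2 == 0.3)
decimal_exact = (dec16.add(Decimal("0.1"), Decimal("0.2")) == Decimal("0.3"))
print(binary_exact, decimal_exact)
```

The second comparison is why decimal floating point is attractive for financial and other base-10 data: the values people write down are the values that get stored.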

And finally, FL505 improves temporal support for triggers. It delivers the capability to reference system temporal tables and archive-enabled tables in the WHEN clause of your triggers.

Summary

IBM is using function levels to deliver significant new capabilities for Db2 12 for z/OS. It is important for you and your organization to keep up-to-date on this new functionality and to determine where and when it makes sense to introduce it into your Db2 databases and applications.

Also, be aware that if you are not currently running at FL504, moving to FL505 activates all earlier function levels. You can find a list of all the current function levels here.

Wednesday, June 20, 2018

Fast and Effective Db2 for z/OS Test Data Management with BCV5

Perhaps the most significant requirement for coding successful Db2 application programs is having a reasonable set of test data to use during the development process. Without data, there is no way to test your code to make sure it runs. But there are many issues that must be overcome in order to have a useful test data management process. Today we will talk about this along with a key enabling component of such a process, BCV5 from UBS Hainer.

One of the first things that organizations try is to make a copy of production data for testing. But this is easier said than done. You cannot just stop your production databases to make a copy of them for testing. But you still want a fast, consistent copy of the data, consistent in terms of both units of work and referential integrity. And maybe you just want some of the data, not all of it. And we haven’t even talked about the potential regulatory concerns if you are copying personally identifiable information.

When you initially go to build your test data environment, the tools at your disposal are likely the utilities that came with Db2. This means that you will start with solutions like unloading and loading the data. But the LOAD and UNLOAD utilities are not known for their speed, so this can take a long time to accomplish – both for the initial creation and for any subsequent refreshing of the test data. This is important because test data must be refreshed on a regular basis as application testing is performed. Without the capability to refresh it is impossible to compare test runs and develop your programs consistently.

So, what should you do? Well, the first step is to create a consistent test bed either from scratch or, more likely, from production. And you want to do this efficiently and without interrupting production processing. This core bed of test data can be manipulated to reduce its size and even to satisfy regulatory requirements. With a core set of data you can then develop procedures to copy this data out to the various development and QA environments. To succeed, you need a fast method of populating multiple environments, on demand, from the approved test bed.

A key to achieving such an environment is an efficient Db2 data copying tool like BCV5, which can be used to copy and refresh Db2 data very rapidly. BCV5 copies Db2 table spaces and indexes within the same Db2 subsystem or even between different Db2 subsystems much faster than unloading and reloading the data. Using BCV5 you can deliver speedy copies because it works directly at the VSAM level. As BCV5 copies at the VSAM level it can replace Db2-internal OBIDs with the correct target values. This is significantly more efficient than unloading and loading one row at a time. And it takes away the complicated user-managed OBIDXLAT capability of DSN1COPY.

If you have used DSN1COPY in the past you know that it can be difficult to use; this is not the case with BCV5. With DSN1COPY you must specify a series of parameters that describe the input, such as the PIECESIZE, NUMPARTS, DSSIZE, whether it is a LOB table space or not, and more. BCV5 determines all required values automatically, making things a lot easier and less prone to failure.

And if you use LOB and XML data, and these days who doesn’t, BCV5 handles this data like any other, copying it at the same rate as regular table spaces.

BCV5 copies everything, not just the physical Db2 data, but also all of the associated structures including databases, table spaces, tables, indexes, and even views, triggers, aliases, synonyms, constraints, and so on! And you don’t need to worry if objects already exist; BCV5 will check for compatibility and keep the environment accurate. And all of the functionality you’d expect is there, such as the ability to rename objects between environments and to run the copy job either manually or via a job scheduler. Furthermore, you can interact with BCV5 using either an ISPF or a GUI interface.

Using BCV5, you can even use image copies as the source for your test data. BCV5 can use the most recent image copy, or an older image copy chosen by generation number, timestamp, or data set name pattern. BCV5 can automatically identify the correct image copy data sets and use them as the source for the data to be copied. You can even use BCV5 to refresh indexes using image copies of indexes if they exist.

Keeping Db2 statistics accurate can be another vexing test data issue. Generally speaking, you want to keep statistics up-to-date, but in test you probably want test statistics to mirror production. BCV5 can copy both RUNSTATS and RTS (Real Time Stats) directly from the source environment into the target. There is no need for a separate RUNSTATS job or to do a REORG in order to collect an RTS baseline.

And let’s not forget the most impressive aspect of BCV5, its speed and efficiency. BCV5 runs tasks in parallel with automatic workload balancing to further improve the performance of copying Db2 data. This efficiency comes in three forms: less CPU consumption, less elapsed run time, and a reduction in the management steps which can be automated instead of being done manually.

A case in point: a large automobile manufacturer uses BCV5 to manage its large Db2 test data environment consisting of over 11,000 table space partitions, another 11,000+ index partitions, and 20 LOBs. Before deploying BCV5, the company required hundreds of jobs that took almost 2 weeks to create, configure, and execute. After automating the process with BCV5, the entire process requires only 6 jobs that can refresh the test environments in 91 minutes. Impressive, no?

UBS Hainer markets other tools that augment and assist BCV5. For example, its In-Flight Copy add-on can enable BCV5 to get up-to-the-moment accurate data by gathering information from the Db2 log to make consistent copies of table spaces and indexes. It also offers a Reduction and Masking Data add-on to assist with enforcing privacy regulations in your test data. And BCV4 can be used to duplicate an entire Db2 subsystem.

The bottom line is that setting up test data can be difficult and time-consuming. Without a well-thought-out approach to gathering and refreshing test beds, application developers and quality assurance personnel will run into issues as they try to test Db2 code with corrupted or improper data. If your organization has issues with effectively managing test data for your Db2 for z/OS developers, take a look at UBS Hainer’s BCV5 solution for quickly copying and refreshing Db2 data.