Showing posts with label functions. Show all posts
Showing posts with label functions. Show all posts

Thursday, June 02, 2022

Db2 13 for z/OS is Here!

Here we are, in June 2022, about 5 years or so since Db2 12 for z/OS was released. And lo' and behold, IBM has given us a new version of Db2 for z/OS to learn and adopt: Db2 13 for z/OS

The new version is generally available (as of May 31, 2022). If you were not paying close attention though, you may have missed it. Db2 13 was announced at the same time as the new mainframe (IBM z16), so it didn't get quite the same level of attention. But those of us who use Db2 for z/OS day in and day out will find a lot of great new stuff in this latest and greatest version of Db2.

I'm not going to go into great detail about the new features and functionality of Db2 13 for z/OS today, but I will offer a high-level overview. Look for future blog posts to dig into more of the nitty gritty tech details and capabilities.

The first thing to mention is that you will need to activate the last Db2 12 function level 510 (FL510) before you can migrate to Db2 13. As many organizations are lagging behind in terms of function level activation, it will be interesting to see how this requirement impacts migration to Db2 13.

AI

So what can users expect from this new version? Well, it seems that the most talked-about features are related to adopting AI. Functions that deliver AI capabilities into Db2 will make it easier for organizations on the AI journey to integrate Db2 into their processes.

Perhaps the most significant AI addition to Db2 13 is the SQL Data Insights feature. Provided as an extension to Db2, SQL Data Insights is delivered using built-in functions to deliver AI capabilities like uncovering heretofore unknown relationships in your data. Since it uses built-in functions you can use it anywhere that you use SQL!

Db2 13 offers additional AI help delivering the ability to simplify building models, Natural Language Processing (NLP), and exploiting the speed of the IBM z16 for training and querying data.

IBM z16 Synergy

The next thing that has been highly-touted is that Db2 13 takes advantage of new capabilities delivered in the IBM z16 hardware.  The new Telum chip used by the z16 mainframe provides powerful AI capabilities that Db2 uses to bolster its AI capabilities (such as SQL Data Insights). And we have already touched on that in terms of speeding up training and querying data for AI.

Db2 for z/OS is unique in that it is the only major DBMS that is designed specifically for a single operating system (z/OS) and hardware platform (IBM Z). This enables IBM (the provider of the DBMS, O/S, and hardware) to take advantage of capabilities unique to the platform, because there is no worry about supporting other platforms.

One example of this unique synergy is the ability to improve sort performance using the SORTL instruction of the IBM Z15 and z16. Additionally, the IBM z16 System Recovery Boost can minimize downtime by speeding up the performance of Db2 for z/OS restart. 

But What About BAU?

OK, so there is new AI stuff and great synergy with the IBM Z, but what about the features and functionality that make it easier to keep up with Business As Usual (BUA)? You know, things like easier administration, better performance, and so on?

Good news! There are a plethora of great new capabilities and improvements in Db2 13 for z/OS. While I cannot adequately cover them in detail today, some examples include:

  • The ability to convert back-and-forth between partition-by-growth and partition-by-range Db2 table spaces. 
  • Support for more concurrent threads and open data sets, as well as improved storage conditions. 
  • DDF storage relief.
  • Real Time Statistics (RTS) improvements.
  • Many improvements to IBM Db2 utility functionality.
  • Security and compliance improvements, including integration to the IBM Z Security and Compliance Center.

Summary

The bottom line is that there is a new version of Db2 for z/OS that mainframe shops will need to learn and prepare for. As with any new Db2 version, it will be exciting to dig in and discover all of the new stuff that can help us do our jobs better... and improve our organization's efforts to use its data to improve business.

Keep checking back here as I will blog in more detail about the new capabilities of Db2 13 for z/OS over time... 

Monday, January 24, 2022

Db2 DATE and TIME Arithmetic

After turning the last page on the 2021 calendar to embark on another year, I thought it might be a good time to reflect on date and time data... and how it can be used in Db2 for z/OS.


One of the most important database design decisions you will make – for Db2 or, indeed, any relational DBMS – is to use the proper data types for your columns when building tables. The data type that you choose for each column should be the one that most closely matches the domain of values that the column can be used to store.

Perhaps the most important design choice is to use the date and time data types that are available to you; for Db2 this is DATE, TIME, and TIMESTAMP. The ability to store dates as a native Db2 data type is a great advantage. If you need to store date information in your Db2 tables you should favor using the DATE data type instead of forcing a date into some other data type (such as CHAR or DECIMAL). Many a database design has been ruined (in my opinion) because somebody decided to store date data in a CHAR column.

When Db2 knows that the data should be a DATE or a TIME it can force data integrity such that no non-date/time value could ever be stored in the column. This is a big advantage, but it is not the only one. Db2 also provides numerous display formats so date and time values can be displayed in many different ways without having to store them in specific display formats.

Another big reason is that Db2 allows users to perform date/time arithmetic. So, you can easily use date columns to calculate durations or past and future dates based on a number of days, months, and years. The same type of arithmetic can be used for time and timestamp data. Just think about the application code you would have to write to manipulate and manage date/time values!

How Does Date/Time Arithmetic Work?

Db2 enables you to add and subtract DATE, TIME, and TIMESTAMP columns. In addition, you can add date and time durations to or subtract them from, such columns.

Arithmetic on date and time data is fairly straightforward. The plus (+) and minus (-) operations can be used on date and time values and durations. A duration is a number used to represent an interval of time. Db2 recognizes four types of durations.

1.       A labeled duration explicitly specifies the type of duration. An example of a labeled duration is 15 MINUTES. Labeled durations can specify the duration in years, months, days, hours, minutes, seconds, or microseconds.

2.       A DATE duration is a DECIMAL(8,0) number that has the format YYYYMMDD. The YYYY represents the number of years in the duration, MM the number of months, and DD the number of days. When you subtract one date from another, the result is a date duration in this format.

3.       A TIME duration is a DECIMAL(6,0) number with the format HHMMSS. The HH represents the number of hours, MM the number of minutes, and SS the number of seconds. When you subtract one time from another, the result is a time duration in this format.

4.       A TIMESTAMP duration is more complex than date and time durations. The
TIMESTAMP duration is a DECIMAL(20,6) number having the format YYYYXXDDHHMMSSZZZZZZ. The duration represents YYYY years, XX months, DD days, HH hours, MM minutes, SS seconds, and ZZZZZZ microseconds. When you subtract a TIMESTAMP from a TIMESTAMP, you get a TIMESTAMP duration.

So, if you want to add one hour to a TIME column you can simply specify TIME_COL + 1 HOUR. Or subtract a day from a date column easily, such as DATE_COL – 1 DAY. Simple, right?

Well, the rules for date and time arithmetic are somewhat complex. Remember that only addition and subtraction can be performed on date and time data (no division or multiplication). In addition, one of the two operands must be a duration. This stands to reason. For example, two dates cannot be added together, but a duration can be added to a date. The same goes for two times.

Use date and time arithmetic with care. If you understand the capabilities and features of date and time arithmetic, you should have few problems implementing it. Keep the following rules in mind:

                  When you issue date arithmetic statements using durations, do not try to establish a common conversion factor between durations of different types. For example, the date arithmetic statement:
                  DATE(‘2014/04/03’) - 1 MONTH

              is not equivalent to the statement:
                                DATE(‘2014/04/03’) - 30 DAYS

April has 30 days, so the normal response would be to subtract 30 days to subtract one month. The result of the first statement is 2014/03/03, but the result of the second statement is 2014/03/04. In general, use like durations (for example, use months or use days, but not both) when you issue date arithmetic.

                  If one operand is a date, the other operand must be a date or a date duration. If one operand is a time, the other operand must be a time or a time duration. You cannot mix durations and data types with date and time arithmetic.

                  If one operand is a timestamp, the other operand can be a time, a date, a time duration, or a date duration. The second operand cannot be a timestamp. You can mix date and time durations with timestamp data types.

Using Date/Time Functions

Db2 also provides numerous date/time functions that can be used to simply manipulate and modify date/time values. Let’s take a look at an example.

Suppose you want to express the duration resulting from date subtraction as a total-number-of-days (exact total, and not an approximate total)? Consider this query:

     SELECT DATE ('03/01/2004') - '12/01/2003'

It returns a duration of 00000300 (that is, 3 months). And those 3 months encompass a 29-day February plus a 31-day January plus a 31-day December (total 91 days). So the answer that we want to return is 91.

The answer to this conundrum lies in using the DAYS function. The following will return the result as a number of days:

     SELECT DAYS('03/01/2004') - DAYS('12/01/2003')

This query will return to you the exact number of days between the two dates. The DAYS function converts a Db2 date or timestamp into an integer value representing one more than the number of days since January 1, 0001. So converting both dates using the DAYS function and subtracting yields the desired result.

The functions provided by Db2 that can be applied to DATE, TIME, and TIMESTAMP columns can greatly assist your development efforts when using date/time data. Chances are you will find a function to help you with your task at hand, depending on your specific needs. Consider the functions listed in the following table.

DB2 Date/Time Functions

CHAR

Converts a Db2  date, time, timestamp, ROWID, floating point, integer, or decimal value to a character value.

DATE

Converts a value representing a date to a Db2 date. The value to be converted can be a Db2 timestamp, a Db2 date, a positive integer, or a character string.

DAY

Returns the day portion of a Db2 date or timestamp.

DAYOFMONTH

Similar to DAY except DAYOFMONTH cannot accept a date duration or time duration as an argument.

DAYOFWEEK

Converts a date, timestamp, or string representation of a date or timestamp into an integer that represents the day of the week. The value 1 represents Sunday, 2 Monday, 3 Tuesday, 4 Wednesday, 5 Thursday, and so on.

DAYOFYEAR

Converts a date, timestamp, or string representation of a date or timestamp into an integer that represents the day within the year. The value 1 represents January 1st, 2 January 2nd, and so on.

DAYS

Converts a Db2 date or timestamp into an integer value representing one more than the number of days since January 1, 0001.

EXTRACT

Returns a portion of a date or timestamp; you can use EXTRACT to slice up a date/time value into its component pieces.

HOUR

Returns the hour portion of a time, a timestamp, or a duration.

JULIAN_DAY

Converts a Db2 date or timestamp, or character representation of a date or timestamp, into an integer value representing the number of days from January 1, 4712 BC to the date specified in the argument.

MICROSECOND

Returns the microsecond component of a timestamp or the character representation of a timestamp.

MIDNIGHT_SECONDS

Returns the number of seconds since midnight for the specified argument which must be a time, timestamp, or character representation of a time or timestamp.

MINUTE

Returns the minute portion of a time, a timestamp, a character representation of a time or timestamp, or a duration.

MONTH

Returns the month portion of a date, a timestamp, a character representation of a date or timestamp, or a duration.

MONTHS_BETWEEN

Returns an estimate of the number of months between two expressions.

QUARTER

Converts a date, timestamp, or string representation of a date or timestamp into an integer that represents the quarter within the year. The value 1 represents first quarter, 2 second quarter, 3 third quarter, and 4 fourth quarter.

SECOND

Returns the seconds portion of a time, a timestamp, a character representation of a time or timestamp, or a duration.

TIME

Converts a value representing a valid time to a Db2 time. The value to be converted can be a Db2 timestamp, a Db2 time, or a character string.

TIMESTAMP

Obtains a timestamp from another timestamp, a valid character-string representation of a timestamp, or a combination of date and time values (without a time zone).

TIMESTAMPADD

Adds an interval to a timestamp.

TIMESTAMPDIFF

Subtracts two timestamps and returns an interval.

TIMESTAMP_FORMAT

Changes the display format for a timestamp value.

TIMESTAMP_TZ

Returns a timestamp with time zone value.

WEEK

Converts a date, timestamp, or string representation of a date or timestamp into an integer that represents the week within the year (with Sunday as the first day of the week). The value 1 represents the first week of the year, 2 the second week, and so on.

WEEK_ISO

Converts a date, timestamp, or string representation of a date or timestamp into an integer that represents the week within the year (with Monday as the first day of the week).

YEAR

Returns the year portion of a date, a timestamp, or a duration.

A Caveat

Keep in mind that this post addresses date and time data in Db2 for z/OS. Yes, other RDBMS products also support date and time data, but there will be differences and nuances that you will need to understand for each. Here is a nice blog post that discusses working with dates on Oracle, SQL Server, PostgreSQL, and MariaDB if you are interested.

Summary

Using date and time data correctly can be a bit confusing, but the rewards of learning how to use Db2 to help you properly use date and time data are numerous. Do not continue to operate in the void. The wise Db2 professional will learn proper Db2 date and time usage and the vast support built into Db2 for manipulating date and time values.

And the sooner, the better!

That way you can let Db2 perform date and time formatting, integrity checking, and arithmetic… so you won’t have to!

Thursday, June 29, 2017

The DB2 12 for z/OS Blog Series - Part 12: New Built-in Functions

As with most new releases of DB2 for z/OS, at least lately, there are several new built-in functions (or BIFs) that have been added. DB2's BIFs are used to translate data from one form or state to another. They can be used to overcome data format, integrity and transformation issues when you are reading data from DB2 tables. 

So what new things can we do with functions in DB2 12 for z/OS?


The ARRAY_AGG function can be used to build an array from table data. It returns an array in which each value of the input set is assigned to an element of the array. So basically speaking, you can use ARRAY_AGG to read values from rows of a table and convert those values into an array. For example, if I wanted to create an array of name from the EMP table for all females employees I could write it like this:


SET ARRAYNAME = (SELECT LASTNAME FROM DSN8C10.EMP WHERE SEX = 'F'); 

The new part is the ability to use an associative array aggregation. That means that the ARRAY_AGG function is invoked where there is a target user-defined array data type in the same statement, or the result of the ARRAY_AGG function is explicitly cast to a user-defined array data type.


More details can be found here.


Another new capability comes with the LISTAGG function, which is only available as of function level 501. The LISTAGG function aggregates a set of string values for a group into

one string by appending the string-expression values based on the order that is specified in the WITHIN GROUP clause.

So if I needed to create a list of comma-separated names, in alphabetical order grouped by department I could write:


SELECT WORKDEPT,
       LISTAGG(LASTNAME, ’, ’) WITHIN GROUP(ORDER BY LASTNAME)
       AS EMPLOYEES
FROM   DSN8C10.EMP

GROUP BY WORKDEPT;

You can find additional details here.


DB2 12 for z/OS also adds functions for calculating the percentile of a set of values. There are two options:

  • PERCENTILE_CONT
  • PERCENTILE_DISC
The PERCENTILE_CONT function returns a percentile of a set of values treated as a continuous distribution. The calculated percentile is an interpolated value that might not have appeared in the input set.

On the other hand, the PERCENTILE_DISC function returns a percentile of a set of values treated as discrete values. The calculated percentile is always a value that appeared in the input set.


Consider the following two statements:


SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY COMM),
FROM   DSN8C10.EMP
WHERE  WORKDEPT = 'E21';


The result here, using the sample data, would be 1968.50. There are an even number of rows, so the percentile using the PERCENTILE_CONT function would be determined by interpolation. The average of the value of the two middle rows (1907.00 and 2030.00) is used.

SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY COMM),
FROM EMPLOYEE
WHERE WORKDEPT = 'E21';

The same SQL statement but substituting PERCENTILE_DISC for PERCENTILE_CONT would return 1907.00. Again, the example would return 6 rows (an even number) but instead of an average a discrete value is returned; the value of the first of the two middle rows, which is 1907.00.


Another set of new functions give the ability to generate unique values that can be used for keys:
  • GENERATE_UNIQUE
  • GENERATE_UNIQUE_BINARY
In both cases, the function will return a unique value that includes the internal form of the Universal Time, Coordinated (UTC), and the Sysplex member (for Data Sharing environments). 

For GENERATE_UNIQUE a bit data character string 13 bytes long is returned. That means CHAR(13) FOR BIT DATA.

For GENERATE_UNIQUE_BINARY a BINARY(16) value is returned. Both functions require parentheses without any arguments.


You can use the new WRAP function to obfuscate your database code objects. The function works only on procedural objects (stored procedures, triggers and user-defined functions).

The general idea behind wrapping procedural database objects is to encode a readable data definition statement such that its contents are not easily identified. The procedural logic and embedded SQL statements in an obfuscated data definition statement are scrambled in such a way that any intellectual property in the logic cannot be easily extracted.


A related system stored procedure, CREATE_WRAPPED, is also provided that can be used to obfuscate a readable data definition statement and deploy it in the database. 
Read syntax diagram

More details can be found here and here and here.


Finally, there are a series of new functions for returning hashes. Given an expression, a hash algorithm is applied and the hash value is returned. There are four options:
  • HASH_CRC32
  • HASH_MD5
  • HASH_SHA1
  • HASH_SHA256
The name of the function determines the hashing algorithm that is used and the data type of the result, as shown in the table below:


BIF Algorithm Data Type
HASH_CRC32 CRC32 BINARY(4)
HASH_MD5 MD5 BINARY(16)
HASH_SHA1 SHA1 BINARY(20)
HASH_SHA156 SHA256 BINARY(32)

Summary


The general advice for every release of DB2 holds for DB2 12: always read through the manuals to find the new functions that can be used to minimize the amount of programming and work that needs to be done. It is important for both DBAs (in order to give good advice and be able to review SQL) and programmers (in order to write efficient and effective SQL) to know what functions are available. Be sure to review the new BIFs in DB2 12 and test them out to see how they work and where they can best be used at your shop!

Tuesday, October 06, 2015

Sorting by Day of the Week in DB2

Today's blog will walk you through a sorting "trick" that you can use when you are dealing with days of the week. A relatively common problem involves sorting a column that contains a day-of-the-week not in alphabetical order by in the order it appears in the week...

Assume that you have a table containing transactions, or some other type of interesting facts. The table has a CHAR(3) column containing the name of the day on which the transaction happened; let’s call this column DAY_NAME. Now, let’s further assume that we want to write queries against this table that orders the results by DAY_NAME. We’d want Sunday first, followed by Monday, Tuesday, Wednesday, and so on. How can this be done?

Well, if we write the first query that comes to mind, the results will obviously be sorted improperly:

SELECT   DAY_NAME, COL1, COL2 . . .
FROM     TXN_TABLE
ORDER BY DAY_NAME;

The results from this query would be ordered alphabetically; in other words
  1. FRI
  2. MON
  3. SAT
  4. SUN
  5. THU
  6. TUE
  7. WED

And it would rare to want this type of data sorted in that order, right? The more common need would be to sort the data the way it appears on the calendar.



One solution would be to design the table with an additional numeric or alphabetic column that would allow us to sort that data properly. By this I mean that we could add a DAY_NUM column that would be 1 for Sunday, 2 for Monday, and so on. Then we could SELECT DAY_NAME but sort by DAY_NUM.

But this is a bad fix. Not only does it requires a database design change, this "fix" also introduces the possibility for the DAY_NUM value and DAY_NAME value to get out of sync... unless we are very careful, or perhaps do not allow DAY_NUM to be changed other than via an INSERT trigger that automatically populates the correct number. But requiring a trigger adds even more complexity to this "fix" which really should indicate to us that it is not a very good proposal.

A better solution uses just SQL and requires no change to the database structures. All you need is an understanding of SQL and SQL functions – in this case, the LOCATE function.  Here is the SQL:

SELECT   DAY_NAME, COL1, COL2 . . .
FROM     TXN_TABLE
ORDER BY LOCATE('SUNMONTUEWEDTHUFRISAT',DAY_NAME);

The trick here is to understand how the LOCATE function works: it returns the starting position of the first occurrence of one string within another string. So, in our example, LOCATE finds the position of the DAY_NAME value within the string 'SUNMONTUEWEDTHUFRISAT', and returns the integer value of that position.

So,if DAY_NAME is WED, the LOCATE function in the above SQL statement returns 10... and so on. To summarize, Sunday would return 1, Monday 4, Tuesday 7, Wednesday 10, Thursday 13, Friday 16, and Saturday 19. This means that our results would be in the order we require.

(Note: Some other database management systems have a function similar to LOCATE called INSTR.) 

Of course, you can go one step further if you’d like. Some queries may need to actually return the day of week. You can use the same technique with a twist to return the day of week value given only the day’s name. To turn this into the appropriate day of the week number (that is, a value of 1 through 7), we divide by three, use the INT function on the result to return only the integer portion of the result, and then add one:

INT(LOCATE('SUNMONTUEWEDTHUFRISAT',DAY_NAME)/3) + 1;


Let’s use our previous example of Wednesday again. The LOCATE function returns the value 10. So, INT(10/3) = 3 and add 1 to get 4. And sure enough, Wednesday is the fourth day of the week. 

You can monkey with the code samples here to modify the results to your liking. For example, if you prefer Monday to be the first day of the week and Sunday the last day of the week, simply change the text string in the LOCATE function as follows:

LOCATE('MONTUEWEDTHUFRISATSUN',DAY_NAME)

My intent with this blog posting is actually twofold. Firstly, I wanted to share a method of sorting days-of-the-week... but secondly, I hope to also have piqued your interest in using SQL functions to achieve all types of different processing requirements. DB2 offers quite a lot of functions and it seems to me that many programmers fail to use functions to their fullest.

So if you are ever face-to-face with a difficult query request, step back and take a moment to consider how DB2 functions might be able to help!

Monday, June 01, 2015

DB2 User-Defined Functions

In my last post here, I talked about the importance of built-in DB2 functions and how they can be used to simplify development. But DB2 also allows us to create our own, user-defined functions to programmatically extend the functionality of DB2.

A user-defined function, or UDF for short, is procedural functionality that you code up and add to DB2. The UDF, after coded and implemented, extends the functionality of DB2 SQL by enabling users to specify the UDF in SQL statements just like built-in SQL functions.
User-defined functions are ideal for organizations wanting to utilize DB2 and SQL to perform specialized routines with your own rules, business logic and data transformation procedures.

UDFs are similar in functionality to application subroutines, but user-defined functions are different because they can be used inside SQL statements. The only way that user-defined functions can be executed is from within an SQL statement. This gives them great power. A user-defined function is not a substitute for an application subroutine, subprogram, or procedure. Instead, user-defined functions extend the functionality of the SQL language.
There are two basic ways to create a user-defined function: You can code your own program from scratch (using either a host programming language or SQL) or you can edit an existing function. When coded, the CREATE FUNCTION statement registers the user-defined function to DB2 for use.

Types of User-Defined Functions (UDFs)

DB2 supports five different types of UDFs depending upon the manner in which they are coded and the type of functionality they support.
  • The first type is the external scalar UDF. Recall, from the prior post, that scalar functions are applied to a column or expression and operate on a single value. Furthermore, external functions are referred to as external because they are written and developed outside of (or external to) DB2. External UDFs must be written in a host programming language. DB2 user-defined functions can be written in Assembler, C, C++, COBOL, Java, or PL/I.
  • The next type of UDF is the external table UDF. Remember that table functions, when called, return an entire table. A table function is specified in the WHERE clause of a SELECT statement taking the place of a table, view, synonym, or alias.
  • A third type of user-defined function can be created from another existing function. This is a sourced function. A sourced function is based on a function that already exists -— it can be based on a built-in function (scalar or column) or another user-defined function that has already been created. External functions are coded from scratch, whereas sourced functions are created based on a pre-existing function and inherit the attributes of the underlying source function.
  • DB2 also supports the creation of SQL scalar user-defined functions. SQL scalar functions return a scalar value and are written entirely using SQL. Fairly extensive functionality can be delivered via SQL scalar functions using SQL procedural language.
  • The fifth, and final type of UDF supported by DB2 was introduced as of DB2 10 for z/OS: the SQL table UDF. SQL table functions are written exclusively in SQL but return a table as a result.
User-defined functions can be used to handle non-traditional data requirements, as well as to build DB2 databases that are customized to your business requirements.

There are a lot of details involved in coding and using UDFs, such as schema management, configuring WLM for execution, SQL restrictions, and more. But I will not be digging into those details in today’s post. I invite you to research the topic and learn how to extend the functionality of your DB2 environment using UDFs. 

Tuesday, May 19, 2015

The Importance of Using DB2 Functions

There are three basic types of built-in functions available to DB2 users: 
  1. aggregate functions, 
  2. scalar functions, and 
  3. table functions. 

It is important that you understand what functions are available – as well as what they do – in order to simplify program development and data access. Functions are invoked simply by specifying the function name and any required operands. A built-in function can be used any place an expression can be used (with some exceptions).

The first type of function is called an aggregate function, or sometimes a column function. Aggregate functions can be used to compute, from a group of rows, a single value for a designated column or expression. This provides the capability to aggregate data, thereby enabling you to perform statistical calculations across many rows with one SQL statement.

An aggregate function can be specified in a WHERE clause only if that clause is part of a subquery of a HAVING clause. In addition, every column name specified in the expression of the aggregate function must be a correlated reference to the same group.

The aggregate functions are AVG, CORRELATION, COUNT, COUNT_BIG, COVARIANCE, COVARIANCE_SAMP, MAX, MIN, STDDEV, SUM, VARIANCE, and XMLAGG.

I won’t be explaining what each of these functions does in today’s blog entry; instead, I suggest that you review their functionality in the SQL Reference Guide (SC19-4066 for DB2 11 for z/OS).

The second type of function is the scalar function, which is probably the most commonly used type of function. However, there are so many scalar functions that it can be difficult to keep track of them all. A scalar function is applied to a column or expression and operates on a single value. Contrast this with the aggregate functions, which are applied to a set of data and return only a single result.

The result of a scalar function is a transformed version of the column or expression being operated on. The transformation of the value is based on the scalar function being applied and the value itself. For example, you might use the CHAR function to convert a DB2 date, time, timestamp, ROWID, floating point, integer, or decimal value to a character value… or you might use the DIGITS function to convert a number to a character string of digits.

Of course, these are just a few simple examples of the many functions available in DB2 for z/OS. There are well over 100 scalar functions, each of which may be useful to minimize the amount of code you write by making DB2 do it, instead!
With so many functions available it would not make sense to cover them all here. Suffice it to say, by acquiring a sound understanding of DB2’s functions many requirements that seem to be difficult at first glance, are achievable using nothing more than SQL (check out this example to clarify what I mean: Sorting Out a Solution in DB2). 

The third type of function supported by DB2 is the table function, which can be specified only in the FROM clause of a query. This is because table functions return columns and resemble a table. There are three basic table functions:
  • ADMIN_TASK_LIST: Returns a table with one row for each defined task in the administrative scheduler task list. The administrative scheduler is covered in Part VI, “DB2 Utilities and and Commands.”
  • ADMIN_TASK_OUTPUTFor an execution of a stored procedure, returns the output parameter values and result sets, if available.
  • ADMIN_TASK_STATUS: Returns a table with one row for each defined task in the administrative scheduler task list, showing the status of the task the last time it was run.

Additional Useful Function Information 

DB2 also offers some other types of built-in functions including:
  • MQSeries Built-In Functions - scalar and table functions for use with IBM’s message queuing software, MQSeries. These functions enable MQSeries messages to be received and sent.
  • XML Built-In Functions - scalar functions to allow applications to efficiently generate and manipulate XML data with DB2.
The final built-in function to be aware of is the RAISE_ERROR Function. It is used to raise an error condition in the SQLCA. The user supplies the SQLSTATE and error description for the error to be raised. The error will be raised with the specified SQLSTATE and a SQLCODE of –438. The RAISE_ERROR function can be used to signal application program and data problems.

What About User-Defined Functions?

DB2 also supports the creation of additional functions (and data types) to supplement the built-in function (and data types) supplied out-of-the-box. User-defined functions (and types) give users the ability to effectively customize DB2 to their shop requirements. The ability to customize is potentially very powerful. It also can be quite complex and requires detailed knowledge, additional application development skills, and administrative dexterity.


We will take a look at UDFs in the next blog post, so check in again next week for that!

Monday, April 14, 2014

Aggregating Aggregates Using Nested Table Expressions


Sometimes when you are writing your SQL to access data you come across the need to work with aggregates. Fortunately, SQL offers many simple ways of aggregating data. But what happens when you uncover then need to perform aggregations of aggregates?

What does that mean? Well, consider an example. Let's assume that you  want to compute the average of a sum. This is a reasonably common requirement that comes up frequently in applications that are built around sales amounts. Let's say that we have a table containing sales information, where each sales amount has additional information indicating the salesman, region, district, product, date, etc. 

A common requirement is to produce a report of the average sales by region for a particular period, say the first quarter of 2014. But the data in the table is at a detail level, meaning we have a row for each specific sale.

A novice SQL coder might try to write a query with a function inside a function, like AVG(SUM(SALE_AMT)). Of course, this is invalid SQL syntax. DB2 will not permit the nesting of aggregate functions. But we can use nested table expressions and our knowledge of SQL functions to build the correct query.

Let’s start by creating a query to return the sum of all sales by region for the time period in question. That query should look something like this:

SELECT REGION, SUM(SALE_AMT)
FROM   SALES
WHERE SALE_DATE BETWEEN DATE(‘2014-01-01’)
                AND     DATE(‘2014-03-31’)
GROUP BY REGION;


Now that we have the total sales by region for the period in question, we can embed this query into a nested table expression in another query like so:

SELECT NTE.REGION, AVG(NTE.TOTAL_SALES)
FROM (SELECT REGION, SUM(SALE_AMT)
      FROM   SALES
      WHERE SALE_DATE BETWEEN DATE(‘2014-01-01’)
                      AND     DATE(‘2014-03-31’)
      GROUP BY REGION) AS NTE

GROUP BY NTE.REGION;


And voila! We have aggregated an aggregate, (averaged a sum)...

Wednesday, October 09, 2013

Using the DISPLAY Command, Part 4

In this fourth entry of our series on the DISPLAY command, we take a look at using the DISPLAY command to monitor DB2 stored procedures and user-defined functions. Part 1 of this series focused on using DISPLAY to monitor details about you database objects; Part 2 focused on using DISPLAY to monitor your DB2 buffer pools. And Part 3 covered utility execution and log information. 

If your organization uses stored procedures and/or user-defined functions (UDFs), the DISPLAY command once again comes in handy. 

Stored Procedures

You can use the DISPLAY PROCEDURE command to monitor stored procedure statistics. The output will consist of one line for each stored procedure that a DB2 application has accessed. You can qualify stored procedure names with a schema name.

DISPLAY PROCEDURE returns the following information:

  • The status, that is, whether the named procedure is currently started or stopped
  • How many requests are currently executing
  • The high-water mark for concurrently running requests
  • How many requests are currently queued
  • How many times a request has timed out
  • How many times a request has failed
  • The WLM environment in which the stored procedure executes

Here is an example of what will be output by the DISPLAY PROCEDURE command:

DSNX940I = DSNX9DIS DISPLAY PROCEDURE REPORT FOLLOWS
PROCEDURE  STATUS   ACTIVE  QUED  MAXQ  TIMEOUT  FAIL WLM_ENV
CUSTPROC   STARTED    0       0     1      0       0  WLMDB21
SAMPPRC1   STOPQUE    0       5     5      3       0  WLMSAMP
SAMPPRC2   STARTED    2       0     6      0       0  WLMSAMP
GETDATA1   STOPREJ    0       0     1      0       0  WLMDB21
DSNX9DIS DISPLAY PROCEDURE REPORT COMPLETE

DSN9022I = DSNX9COM '-DISPLAY PROC' NORMAL COMPLETION

Keep in mind that the information returned by  DISPLAY PROCEDURE is dynamic. By the time the information is displayed, it is possible that the status could have changed.

User-Defined Functions (UDFs)

For UDFs, you can use the DISPLAY FUNCTION SPECIFIC command to monitor UDF statistics. This command displays one output line for each function that a DB2 application has accessed.  Similar to what is shown for stored procedures, the DISPLAY FUNCTION SPECIFIC command will show:
  • Whether the named function is currently started or stopped, and why
  • How many requests are currently executing
  • The high-water mark for concurrently running requests
  • How many requests are currently queued
  • How many times a request has timed out
  • The WLM environment in which the function executes
When displaying information about stored procedures and UDFs using the DISPLAY PROCEDURE and DISPLAY FUNCTION SPECIFIC commands, a status is returned indicating the state of the procedure or UDF. A procedure or UDF can be in one of four potential states:


STARTED
Requests for the function can be processed
STOPQUE
Requests are queued
STOPREJ
Requests are rejected
STOPABN      
Requests are rejected because of abnormal termination


Summary

When using stored procedures and/or user-defined functions, be sure to use the DISPLAY command to keep track of their status.