Showing posts with label UDFs. Show all posts
Showing posts with label UDFs. Show all posts

Tuesday, February 08, 2022

Db2 User-Defined Functions (UDFs)

A user-defined function, or UDF for short, enables you to customize Db2 to your shop's requirements. It is a very powerful feature that  can be used to add procedural functionality, coded by the user, to Db2. The UDF, once coded and implemented extends the functionality of Db2 by enabling users to specify the UDF in SQL statements just like built-in SQL functions.

Additional functionality and capability aside, it can also be complex to deliver and requires detailed knowledge, additional application development skills, and extra administrative handling.

User-defined functions are ideal for organizations wishing to utilize Db2 and SQL to perform specialized, corporate routines performing business logic and data transformation.

Types of UDFs

There are two ways of creating a user-defined function: you can code your own program from scratch using a traditional programming language; or you can use an existing function as a template, of sorts.

There are two types of user-defined functions that can be written from scratch: scalar functions and table functions. A scalar function is applied to a column or expression and operates on a single value. DB2 offers multiple built-in scalar functions, each of which can be applied to a column value or expression. Examples of built-in scalar functions include LTRIM, SQRT, and SUBSTR. You can read more about Db2 built-in functions in my earlier blog post, A Quick Intro to Db2 SQL Functions.

Table functions are a different type of function that, when invoked, returns an entire table. A table function is specified in the WHERE clause of a SELECT statement taking the place of a table, view, synonym, or alias.

Scalar and table user-defined functions are referred to as external functions, because they are written and developed outside of (or external to) DB2. External UDFs must be written in a host programming language. DB2 UDB user-defined functions can be written in Assembler, C, COBOL, or PL/I.

The final type of user-defined function is a sourced function. A sourced function is based on a function that already exists, that is it can be based on a built-in function or another user-defined function that has already been created. A sourced function can also be based on an existing scalar or column function.

User-defined functions are similar in functionality to application subroutines. But user-defined functions are different because they are used inside of SQL statements. This gives them great power. A user-defined function is not a substitute for an application subroutine, subprogram, or procedure. Instead, user-defined functions are used to extend the functionality of the SQL language.

The Schema

User-defined functions, user-defined distinct types, stored procedures, and triggers all are associated with a schema. By default, the schema name is the authid of the process that issues the CREATE FUNCTION, CREATE DISTINCT TYPE, CREATE PROCEDURE, or CREATE TRIGGER statement.

You can specify a schema name when you create a user-defined function, user-defined type, or trigger. If the schema name is not the same as the SQL authorization ID, then the issuer of the statement must have either SYSADM or SYSCTRL authority, or the authid of the issuing process must have the CREATEIN privilege on the schema.

For example, the following statement creates a user-defined function named FUNCX in the schema named MYSCHEMA:

     CREATE FUNCTION MYSCHEMA.FUNCX ...

If the MYSCHEMA component was not included in the CREATE statement, then the schema would default to the authid of the person (or process) that executed the CREATE statement. In short, the schema is set to the owner of the function. If the CREATE statement was embedded in a program, the owner is the authid of the owner of the plan or package; if the statement is dynamically prepared, the owner is the authid in the CURRENT SQLID special register.

Table Functions

Table functions are different in nature than scalar functions. A table function is designed to return multiple columns and rows. In other words, the output of a table function is itself a table. An example using a table function follows:

SELECT WINNER, WINNER_SCORE,
             LOSER, LOSER_SCORE
      FROM   GAME_RESULTS(5)
      WHERE  LOSER_SCORE = 0;

In this case, the table function GAME_RESULTS( ) is used to return the win/loss statistics for football games. The table function can be used in SQL statements just like a regular DB2 table. The function program is designed to fill the rows and columns of the "table." The input parameter is an INTEGER value corresponding to the week the game was played; if 0 is entered, all weeks are considered. The query above would return all results where the losing team was shut out (had 0 points) during the fifth week of the season.

The following or similar CREATE FUNCTION statement could be used to define the GAME_RESULTS( ) function:

      CREATE FUNCTION GAME_RESULTS(INTEGER)
     RETURNS TABLE (WEEK INTEGER,
                    WINNER CHAR(20),
                    WINNER_SCORE INTEGER,
                    LOSER CHAR(20),
                    LOSER_SCORE INTEGER)
       EXTERNAL NAME GAMES
     
   LANGUAGE C
        PARAMETER STYLE DB2SQL
          NO SQL
          DETERMINISTIC
          NO EXTERNAL ACTION
          FENCED
    
      SCRATCHPAD
    
      FINAL CALL
    
      DISALLOW PARALLEL
    
      CARDINALITY 300;

The parameter identifying this as a table function is the RETURNS TABLE parameter. This parameter is used to define the columns of the table function. The function program must create these rows itself or from another data source such as a flat file.

The value supplied for the CARDINALITY parameter is only an estimate. It is provided to help Db2 optimize statements using the table function. It is possible to return more or fewer rows than is specified in CARDINALITY.

Sourced Functions 

Sourced functions are created from functions that already exist. A sourced function can be based on any existing function, whether built-in (scalar and column) or user-defined (scalar). The biggest reasons that sourced functions are created is to enable functions for user-defined distinct data types. This is required because DB2 implements strong typing. Here is an example of creating a sourced UDF:

 CREATE FUNCTION FINDWORD
    (DOCUMENT, VARCHAR(50))
  RETURNS INTEGER
  SPECIFIC FINDWORDDOC
  SOURCE SPECIFIC FINDWORDCLOB;

In this example a new function, FINDWORD, is created from an existing function FINDWORDCLOB. The function finds the location of the specified word expressed as a VARCHAR(50) value in the supplied DOCUMENT. The function returns an INTEGER indicating the location of the word in the DOCUMENT. DOCUMENT is a user-defined data type based on a CLOB data type.

Summary

UDFs provide powerful functionality that can be used to customize Db2 SQL functionality and standardize your operational functtions and algorithms into your DBMS environment. 

Monday, June 01, 2015

DB2 User-Defined Functions

In my last post here, I talked about the importance of built-in DB2 functions and how they can be used to simplify development. But DB2 also allows us to create our own, user-defined functions to programmatically extend the functionality of DB2.

A user-defined function, or UDF for short, is procedural functionality that you code up and add to DB2. The UDF, after coded and implemented, extends the functionality of DB2 SQL by enabling users to specify the UDF in SQL statements just like built-in SQL functions.
User-defined functions are ideal for organizations wanting to utilize DB2 and SQL to perform specialized routines with your own rules, business logic and data transformation procedures.

UDFs are similar in functionality to application subroutines, but user-defined functions are different because they can be used inside SQL statements. The only way that user-defined functions can be executed is from within an SQL statement. This gives them great power. A user-defined function is not a substitute for an application subroutine, subprogram, or procedure. Instead, user-defined functions extend the functionality of the SQL language.
There are two basic ways to create a user-defined function: You can code your own program from scratch (using either a host programming language or SQL) or you can edit an existing function. When coded, the CREATE FUNCTION statement registers the user-defined function to DB2 for use.

Types of User-Defined Functions (UDFs)

DB2 supports five different types of UDFs depending upon the manner in which they are coded and the type of functionality they support.
  • The first type is the external scalar UDF. Recall, from the prior post, that scalar functions are applied to a column or expression and operate on a single value. Furthermore, external functions are referred to as external because they are written and developed outside of (or external to) DB2. External UDFs must be written in a host programming language. DB2 user-defined functions can be written in Assembler, C, C++, COBOL, Java, or PL/I.
  • The next type of UDF is the external table UDF. Remember that table functions, when called, return an entire table. A table function is specified in the WHERE clause of a SELECT statement taking the place of a table, view, synonym, or alias.
  • A third type of user-defined function can be created from another existing function. This is a sourced function. A sourced function is based on a function that already exists -— it can be based on a built-in function (scalar or column) or another user-defined function that has already been created. External functions are coded from scratch, whereas sourced functions are created based on a pre-existing function and inherit the attributes of the underlying source function.
  • DB2 also supports the creation of SQL scalar user-defined functions. SQL scalar functions return a scalar value and are written entirely using SQL. Fairly extensive functionality can be delivered via SQL scalar functions using SQL procedural language.
  • The fifth, and final type of UDF supported by DB2 was introduced as of DB2 10 for z/OS: the SQL table UDF. SQL table functions are written exclusively in SQL but return a table as a result.
User-defined functions can be used to handle non-traditional data requirements, as well as to build DB2 databases that are customized to your business requirements.

There are a lot of details involved in coding and using UDFs, such as schema management, configuring WLM for execution, SQL restrictions, and more. But I will not be digging into those details in today’s post. I invite you to research the topic and learn how to extend the functionality of your DB2 environment using UDFs. 

Tuesday, May 19, 2015

The Importance of Using DB2 Functions

There are three basic types of built-in functions available to DB2 users: 
  1. aggregate functions, 
  2. scalar functions, and 
  3. table functions. 

It is important that you understand what functions are available – as well as what they do – in order to simplify program development and data access. Functions are invoked simply by specifying the function name and any required operands. A built-in function can be used any place an expression can be used (with some exceptions).

The first type of function is called an aggregate function, or sometimes a column function. Aggregate functions can be used to compute, from a group of rows, a single value for a designated column or expression. This provides the capability to aggregate data, thereby enabling you to perform statistical calculations across many rows with one SQL statement.

An aggregate function can be specified in a WHERE clause only if that clause is part of a subquery of a HAVING clause. In addition, every column name specified in the expression of the aggregate function must be a correlated reference to the same group.

The aggregate functions are AVG, CORRELATION, COUNT, COUNT_BIG, COVARIANCE, COVARIANCE_SAMP, MAX, MIN, STDDEV, SUM, VARIANCE, and XMLAGG.

I won’t be explaining what each of these functions does in today’s blog entry; instead, I suggest that you review their functionality in the SQL Reference Guide (SC19-4066 for DB2 11 for z/OS).

The second type of function is the scalar function, which is probably the most commonly used type of function. However, there are so many scalar functions that it can be difficult to keep track of them all. A scalar function is applied to a column or expression and operates on a single value. Contrast this with the aggregate functions, which are applied to a set of data and return only a single result.

The result of a scalar function is a transformed version of the column or expression being operated on. The transformation of the value is based on the scalar function being applied and the value itself. For example, you might use the CHAR function to convert a DB2 date, time, timestamp, ROWID, floating point, integer, or decimal value to a character value… or you might use the DIGITS function to convert a number to a character string of digits.

Of course, these are just a few simple examples of the many functions available in DB2 for z/OS. There are well over 100 scalar functions, each of which may be useful to minimize the amount of code you write by making DB2 do it, instead!
With so many functions available it would not make sense to cover them all here. Suffice it to say, by acquiring a sound understanding of DB2’s functions many requirements that seem to be difficult at first glance, are achievable using nothing more than SQL (check out this example to clarify what I mean: Sorting Out a Solution in DB2). 

The third type of function supported by DB2 is the table function, which can be specified only in the FROM clause of a query. This is because table functions return columns and resemble a table. There are three basic table functions:
  • ADMIN_TASK_LIST: Returns a table with one row for each defined task in the administrative scheduler task list. The administrative scheduler is covered in Part VI, “DB2 Utilities and and Commands.”
  • ADMIN_TASK_OUTPUTFor an execution of a stored procedure, returns the output parameter values and result sets, if available.
  • ADMIN_TASK_STATUS: Returns a table with one row for each defined task in the administrative scheduler task list, showing the status of the task the last time it was run.

Additional Useful Function Information 

DB2 also offers some other types of built-in functions including:
  • MQSeries Built-In Functions - scalar and table functions for use with IBM’s message queuing software, MQSeries. These functions enable MQSeries messages to be received and sent.
  • XML Built-In Functions - scalar functions to allow applications to efficiently generate and manipulate XML data with DB2.
The final built-in function to be aware of is the RAISE_ERROR Function. It is used to raise an error condition in the SQLCA. The user supplies the SQLSTATE and error description for the error to be raised. The error will be raised with the specified SQLSTATE and a SQLCODE of –438. The RAISE_ERROR function can be used to signal application program and data problems.

What About User-Defined Functions?

DB2 also supports the creation of additional functions (and data types) to supplement the built-in function (and data types) supplied out-of-the-box. User-defined functions (and types) give users the ability to effectively customize DB2 to their shop requirements. The ability to customize is potentially very powerful. It also can be quite complex and requires detailed knowledge, additional application development skills, and administrative dexterity.


We will take a look at UDFs in the next blog post, so check in again next week for that!

Wednesday, October 09, 2013

Using the DISPLAY Command, Part 4

In this fourth entry of our series on the DISPLAY command, we take a look at using the DISPLAY command to monitor DB2 stored procedures and user-defined functions. Part 1 of this series focused on using DISPLAY to monitor details about you database objects; Part 2 focused on using DISPLAY to monitor your DB2 buffer pools. And Part 3 covered utility execution and log information. 

If your organization uses stored procedures and/or user-defined functions (UDFs), the DISPLAY command once again comes in handy. 

Stored Procedures

You can use the DISPLAY PROCEDURE command to monitor stored procedure statistics. The output will consist of one line for each stored procedure that a DB2 application has accessed. You can qualify stored procedure names with a schema name.

DISPLAY PROCEDURE returns the following information:

  • The status, that is, whether the named procedure is currently started or stopped
  • How many requests are currently executing
  • The high-water mark for concurrently running requests
  • How many requests are currently queued
  • How many times a request has timed out
  • How many times a request has failed
  • The WLM environment in which the stored procedure executes

Here is an example of what will be output by the DISPLAY PROCEDURE command:

DSNX940I = DSNX9DIS DISPLAY PROCEDURE REPORT FOLLOWS
PROCEDURE  STATUS   ACTIVE  QUED  MAXQ  TIMEOUT  FAIL WLM_ENV
CUSTPROC   STARTED    0       0     1      0       0  WLMDB21
SAMPPRC1   STOPQUE    0       5     5      3       0  WLMSAMP
SAMPPRC2   STARTED    2       0     6      0       0  WLMSAMP
GETDATA1   STOPREJ    0       0     1      0       0  WLMDB21
DSNX9DIS DISPLAY PROCEDURE REPORT COMPLETE

DSN9022I = DSNX9COM '-DISPLAY PROC' NORMAL COMPLETION

Keep in mind that the information returned by  DISPLAY PROCEDURE is dynamic. By the time the information is displayed, it is possible that the status could have changed.

User-Defined Functions (UDFs)

For UDFs, you can use the DISPLAY FUNCTION SPECIFIC command to monitor UDF statistics. This command displays one output line for each function that a DB2 application has accessed.  Similar to what is shown for stored procedures, the DISPLAY FUNCTION SPECIFIC command will show:
  • Whether the named function is currently started or stopped, and why
  • How many requests are currently executing
  • The high-water mark for concurrently running requests
  • How many requests are currently queued
  • How many times a request has timed out
  • The WLM environment in which the function executes
When displaying information about stored procedures and UDFs using the DISPLAY PROCEDURE and DISPLAY FUNCTION SPECIFIC commands, a status is returned indicating the state of the procedure or UDF. A procedure or UDF can be in one of four potential states:


STARTED
Requests for the function can be processed
STOPQUE
Requests are queued
STOPREJ
Requests are rejected
STOPABN      
Requests are rejected because of abnormal termination


Summary

When using stored procedures and/or user-defined functions, be sure to use the DISPLAY command to keep track of their status.