Monday, March 03, 2008

Q+A: Nulls and DASD

Recently, I received the following question, which I will answer today on this blog posting.

Question: Lets say I have a table A which has 500 columns. Out of those 500 columns only 5 columns have been defined as not nullable and the rest have been defined as NULLS allowed. And out of those 500 columns I have found that 300 columns are unused(empty) totally. My business allows me to remove those 300 columns. My doubt is if I remove those 300 empty columns will I save on DASD space occupied by DB2? Will empty columns occupy DASD space?Would be really helpful if you can guide me on this.

Answer: I'm happy to try to help out. First of all, the short answer to your question is "Yes!" Those 500 columns are all consuming valuable disk space. To determine how much space is being consumed, you will need to examine the data type and length assigned to each column and add them up. And to make matters worse, you must add an additional 1 byte to each of them because the columns are nullable.

In DB2, a NULL is stored using a special one-byte null indicator that is "attached" to every nullable column. If the column is set to NULL, then the indicator field is used to record this. Using NULL will never save space in a DB2 database design - in fact, it will always add an extra byte for every column that can be NULL. The byte is used whether or not the column is actually set to NULL.

So, a column defined as CHAR(5) NOT NULL will required five bytes of storage space - but if it is defined as nullable, then it requires six bytes of storage space - five bytes for the data, and one byte for the null indicator.

Given all of this, it would seem that there is a very viable case to be made for you to remove those columns that are not being used. Of course, this means that you will likely have to make changes to any programs accessing that table. Because the table definition will change (fewer columns) you will need new DCLGENs and those will have to be included and bound into your programs. Be sure to factor this additional workload into your planning before moving forward with this change.

The better question to ask is "How the heck did all of the empty columns get put into the table to begin with and how did that design get past the DBAs?"

If you have an answer for that one, please share it by posting your answer in a comment here!

No comments: