Wednesday, September 03, 2008

When Not to Index

Answering a question I got via e-mail on indexing...

Every now and then I take the opportunity to blog about a question I get through e-mail. This time the question was:

When does it make more sense not to build an index for a DB2 table?

I'll attempt to answer this question for any SQL DBMS, not just for DB2:

First of all, this is a very open-ended question, so I will give a high-level answer. Let's start by saying that most of the time you will want to build at least one - and probably multiple - indexes on each database table that you create. Indexes are crucial for optimizing performance of SQL access. Without an index, queries must scan every row of the table to come up with a result. And that can be very slow.

OK, that being said, here are some times when it might makes sense to have no indexes defined on a table:

  • When all accesses retrieve every row of the table. Because every row will be retrieved every time you want to use the table an index (if used) would just add extra I/O and would decrease, not enhance performance. Though not extremely common, you may indeed come across such tables in your organization.

  • For a very small table with only a few pages of data and no primary key or uniqueness requirements. A very small table (perhaps 10 or 20 pages) might not need an index because simply reading all of the pages is very efficient already.

  • When performance doesn't matter and the table is only accessed very infrequently. But, hey, when do you ever have those type of requirements in the real world?

Other than under these circumstances, you will most likely want to build one or more indexes on each table, not only to optimize performance, but also to ensure uniqueness, to support referential integrity, and perhaps to drive data clustering.

Of course, indexes do not come without cost. Indexes take up disk space and adding a lot of indexes will consume disk space. For some DBMS products, adding many indexes can impact the working set size and perhaps raise memory problems. Additionally, although indexes speed up queries they degrade inserts and deletes, as well as any modification to indexed columns.

What do you think? Are there other situations where a table should have no indexes? Are there any pertinent high-level issues I missed? Feel free to add your thoughts and comments below!

No comments: