Raghu On Tech
Pro tips on managing data at scaleHow To Find Missing Indexes On Foreign Key Columns ?

Problem Statement:
You are trying to figure out if there are any Foreign key constraints in your database with out the corresponding Index. If you are working in an OLTP environment you already know most of your query patterns include a join on primary key (parent table) and foreign key(s) (child tables). It is important to know and address any of these missing indexes in development/lower environments before they become a problem in production database environments. As basic as it may sound, I have discovered these missing indexes once in a while.
Solution:
Solution is an easy one. You run the below SQL and get the list of Foreign Key constraints with out the corresponding indexes. Once you identify these missing indexes you need to use your application/business knowledge to ask yourself if you need to create these indexes. If you are working with a conventional application, answer to this question will most probably be yes. If you are working with an edge case application where parents don’t talk to children then you got a weird family 🙂
select substr(riinfo.tabschema,1,25) as tabschema, substr(riinfo.tabname,1,25) as tabname, substr(riinfo.constname,1,25) as constname, substr(riinfo.colname,1,25) as colname, substr(idxinfo.indname,1,25) as indname from (select a.tabschema, a.tabname, a.constname, b.colname from syscat.tabconst a, syscat.keycoluse b where a.type = 'F' and a.tabname = b.tabname and a.tabschema = b.tabschema and a.constname = b.constname) riinfo left join (select a.tabschema,a.tabname,a.indname,b.colname from syscat.indexes a, syscat.indexcoluse b where a.indschema = b.indschema and a.indname = b.indname) idxinfo on ( riinfo.tabschema = idxinfo.tabschema and riinfo.tabname = idxinfo.tabname and riinfo.colname = idxinfo.colname) where idxinfo.indname is null order by riinfo.tabname;
When I run this SQL against my database I get the below output, which tells me that DB2 explain tables do not have corresponding indexes for their foreign keys. This might not be that big of a problem if your explain tables do not grow.
However if you are keeping track of say for e.g. your top 10 SQL statements performance over a 5 year period of time and you save their explain information into these explain tables, then you may need to create indexes to avoid performance problems for any reporting that you may perform on these tables.
select substr(riinfo.tabschema,1,25) as tabschema, substr(riinfo.tabname,1,25) as tabname, substr(riinfo.constname,1,25) as constname, substr(riinfo.colname,1,25) as colname, substr(idxinfo.indname,1,25) as indname from (select a.tabschema, a.tabname, a.constname, b.colname from syscat.tabconst a, syscat.keycoluse b where a.type = 'F' and a.tabname = b.tabname and a.tabschema = b.tabschema and a.constname = b.constname) riinfo left join (select a.tabschema,a.tabname,a.indname,b.colname from syscat.indexes a, syscat.indexcoluse b where a.indschema = b.indschema and a.indname = b.indname) idxinfo on ( riinfo.tabschema = idxinfo.tabschema and riinfo.tabname = idxinfo.tabname and riinfo.colname = idxinfo.colname) where idxinfo.indname is null order by riinfo.tabname TABSCHEMA TABNAME CONSTNAME COLNAME INDNAME ------------------------- ---------------------------------------- -------------------------------------------------- ------------------------- ------------------------- TRAVELER ADVISE_INDEX SQL131029095034720 RUN_ID - TRAVELER ADVISE_MQT SQL131029095142470 RUN_ID - TRAVELER ADVISE_PARTITION SQL131029095217570 RUN_ID - TRAVELER ADVISE_TABLE SQL131029095253200 RUN_ID - TRAVELER EXPLAIN_ACTUALS SQL131029095430650 EXPLAIN_LEVEL - TRAVELER EXPLAIN_ACTUALS SQL131029095430650 EXPLAIN_REQUESTER - TRAVELER EXPLAIN_ACTUALS SQL131029095430650 EXPLAIN_TIME - TRAVELER EXPLAIN_ACTUALS SQL131029095430650 OPERATOR_ID - TRAVELER EXPLAIN_ACTUALS SQL131029095430650 SECTNO - TRAVELER EXPLAIN_ACTUALS SQL131029095430650 SOURCE_NAME - TRAVELER EXPLAIN_ACTUALS SQL131029095430650 SOURCE_SCHEMA - TRAVELER EXPLAIN_ACTUALS SQL131029095430650 SOURCE_VERSION - TRAVELER EXPLAIN_ACTUALS SQL131029095430650 STMTNO - TRAVELER EXPLAIN_ARGUMENT SQL131029094618810 EXPLAIN_REQUESTER - TRAVELER EXPLAIN_ARGUMENT SQL131029094618810 SOURCE_NAME - TRAVELER EXPLAIN_ARGUMENT SQL131029094618810 SOURCE_SCHEMA - TRAVELER EXPLAIN_ARGUMENT SQL131029094618810 SOURCE_VERSION - TRAVELER EXPLAIN_DIAGNOSTIC_DATA SQL131029094918880 EXPLAIN_REQUESTER - TRAVELER EXPLAIN_DIAGNOSTIC_DATA SQL131029094918880 SOURCE_NAME - TRAVELER EXPLAIN_DIAGNOSTIC_DATA SQL131029094918880 SOURCE_SCHEMA - TRAVELER EXPLAIN_DIAGNOSTIC_DATA SQL131029094918880 SOURCE_VERSION - TRAVELER EXPLAIN_OBJECT SQL131029094632060 EXPLAIN_REQUESTER - TRAVELER EXPLAIN_OBJECT SQL131029094632060 SOURCE_NAME - TRAVELER EXPLAIN_OBJECT SQL131029094632060 SOURCE_SCHEMA - TRAVELER EXPLAIN_OBJECT SQL131029094632060 SOURCE_VERSION - TRAVELER EXPLAIN_PREDICATE SQL131029094735980 EXPLAIN_REQUESTER - TRAVELER EXPLAIN_PREDICATE SQL131029094735980 SOURCE_NAME - TRAVELER EXPLAIN_PREDICATE SQL131029094735980 SOURCE_SCHEMA - TRAVELER EXPLAIN_PREDICATE SQL131029094735980 SOURCE_VERSION - TRAVELER EXPLAIN_STREAM SQL131029094809960 EXPLAIN_REQUESTER - TRAVELER EXPLAIN_STREAM SQL131029094809960 SOURCE_NAME - TRAVELER EXPLAIN_STREAM SQL131029094809960 SOURCE_SCHEMA - TRAVELER EXPLAIN_STREAM SQL131029094809960 SOURCE_VERSION - 33 record(s) selected.
Hope you enjoyed this short blog post. If you have any similar SQL tricks that you use on a regular basis, share them in the comments section below.
PS: You may find this stackoverflow discussion useful for understanding different types of joins.
One of the readers brought it to my attention that this query is showing duplicate column names for the same table. I figured out that the root cause of this problem is a missing join between syscat.tabconst and syscat.keycoluse.
and a.tabschema = b.tabschema
I added the above join and data is looking much better now.