SQL query index using
Hi guys,
I ran into a strange (for me) situation, when I run same query but change the WHERE clause the plan is different and is not connected to the additional condition.
Query that doesn't use the necessary index:
SELECT * FROM Portal.ProductStats ps left JOIN Portal.ProductCacheUpdates pcu ON (pcu.Item=ps.Item) WHERE ps.Item=?
Takes 0.4 sec, doesn't use index on item from Portal.ProductCacheUpdates table
SELECT * FROM Portal.ProductStats ps left JOIN Portal.ProductCacheUpdates pcu ON (pcu.Item=ps.Item) WHERE ps.Item=? AND ps.dateImported<pcu.dateEdited
Takes 0.003 sec, uses index on item from Portal.ProductCacheUpdates table
Can somebody explain? What i'm doing wrong? How should I know if Cache will use the index or not? Check every query plan?
Comments
[Caché SQL Optimization Guide > Introduction to SQL Performance Optimization]
In most cases, helps "tune table". But you can try hints (%ALLINDEX, %FIRSTTABLE, etc.)
Thanks!
But:
1) it doesn't help
2) I don't want to guess, I want to know, how and why.
Thanks again :)
We made the Tune, but only after Purge, as Vitaliy suggested, it works fine.
Thanks a lot.
For a more precise explanation:
Table definitions with ALL indices are required and the complete query plans not just a summary.
And also your version ( $ZV )
Class Portal.ProductStats Extends %Persistent
{
Property Item As %String [ Required ];
Property dateImported As %String [ Required ];
Index pcacheUpds On Item;
}
Class Portal.ProductCacheUpdates Extends %Persistent
{
Property Product As %String [ Required ];
Property Item As %String [ Calculated, SqlComputeCode = { s {*}=$e({Product},1,5)}, SqlComputed ];
Property dateEdited As %String [ Required ];
Index pcacheUpdsUniq On (Product, Item) [ Unique ];
Index pcacheUpdsProd On Item;
}
.png)
Cache for Windows (x86-64) 2017.2 (Build 744U) Fri Sep 29 2017 10:58:27 EDT
- Instead of
Property Product As %String [ Required ];Property Item As %String [ Calculated, SqlComputeCode = { s {*}=$e({Product},1,5)}, SqlComputed ];it is better to use
Property Product As %String [ Required ];Property Item As %String(MAXLEN = 5) [ Required, SqlComputeCode = { s {*}=$e({Product},1,5)}, SqlComputed, SqlComputeOnChange = Product ]; -
Check on the version 2017.2 can not, but checked on 2018.1
SELECT * FROM Portal.ProductStats ps left JOIN Portal.ProductCacheUpdates pcu ON (pcu.Item=ps.Item) WHERE ps.Item=?
- ExtentSize=1 (Portal.ProductStats)
ExtentSize=1000 (Portal.ProductCacheUpdates)
Relative cost = 1338 ◾Read master map Portal.ProductStats.IDKEY, looping on ID.
◾For each row:
Read index map Portal.ProductCacheUpdates.pcacheUpdsProd, using the given %SQLUPPER(Item), and looping on ID. For each row: Read master map Portal.ProductCacheUpdates.IDKEY, using the given idkey value. Generate a row padded with NULL for table Portal.ProductCacheUpdates if no row qualified. Output the row.
- ExtentSize=1000 (Portal.ProductStats)
ExtentSize=1 (Portal.ProductCacheUpdates)
Relative cost = 1219.2 ◾Read index map Portal.ProductStats.pcacheUpds, using the given %SQLUPPER(Item), and looping on ID.
◾For each row:
Read master map Portal.ProductStats.IDKEY, using the given idkey value. Read index map Portal.ProductCacheUpdates.pcacheUpdsProd, using the given %SQLUPPER(Item), and looping on ID. For each row: Read master map Portal.ProductCacheUpdates.IDKEY, using the given idkey value. Generate a row padded with NULL for table Portal.ProductCacheUpdates if no row qualified. Output the row.
As you can see in both cases the index pcacheUpdsProd is used.
- ExtentSize=1 (Portal.ProductStats)
ExtentSize=1000 (Portal.ProductCacheUpdates)
Have you really set up the tables and cleared the cached queries so that the optimizer can start using the new statistics?
Try to do it manually in the terminal:
blablabla><FONT COLOR="#0000ff">d $system</FONT><FONT COLOR="#008080">.SQL</FONT><FONT COLOR="#000000">.</FONT><FONT COLOR="#0000ff">TuneSchema</FONT><FONT COLOR="#000000">(</FONT><FONT COLOR="#008000">"Portal"</FONT><FONT COLOR="#000000">,1), </FONT><FONT COLOR="#0000ff">$SYSTEM</FONT><FONT COLOR="#008080">.SQL</FONT><FONT COLOR="#000000">.</FONT><FONT COLOR="#0000ff">Purge</FONT><FONT COLOR="#000000">(), </FONT><FONT COLOR="#0000ff">$system</FONT><FONT COLOR="#008080">.Status</FONT><FONT COLOR="#000000">.</FONT><FONT COLOR="#0000ff">DisplayError</FONT><FONT COLOR="#000000">(</FONT><FONT COLOR="#0000ff">$system</FONT><FONT COLOR="#008080">.OBJ</FONT><FONT COLOR="#000000">.</FONT><FONT COLOR="#0000ff">CompilePackage</FONT><FONT COLOR="#000000">(</FONT><FONT COLOR="#008000">"Portal"</FONT><FONT COLOR="#000000">,</FONT><FONT COLOR="#008000">"cu-d"</FONT><FONT COLOR="#000000">))</FONT>
The documentation describes in detail how and why.
For example, sometimes the optimizer decides not to use the index if a full crawl would be more efficient. It depends on many parameters: ExtentSize, Selectivity, etc.
Give here what Robert asks, and it will be possible to tell more precisely why.
Sergey
The biggest difference between the 2 queries is the second one is really a JOIN, not a Left Outer Join. By adding the second WHERE condition on a property in pcu you are making this a JOIN and allowing the Optimizer to start in either table. I created the tables on my system and here is what I am seeing
The first query starts in ps using the item index and then joins to the pcu table using a temp file built from the master map (same as the plan you show)
The second query is starting with the pcu table and the doing a join to the ps table using a temp file build from the Item index and the master map.
Why don't we use the Item index in pcu? I can't tell you exactly why, but I know the 2 biggest factors we are looking at are the size of the maps on disk and the Selectivity of the properties involved. In both cases, we are going to need to read the master map so the question is this: Will reading in the Index map reduce the number of blocks we need to read of the master map to make it worth the effort? Without proper values generated by TuneTable, or supplied by you, the optimizer is left to guess at what these values are. In the first case, it has decided that reading the index will not reduce the number of block reads of the master map so it is not using it. This generally happens when we think the Master Map is very small.
To take this any further I would need you to run TuneTable on both of these tables and see if the poor plan is still used. If it is still used please provide the storage info for both classes so I can investigate this.
Thanks Vitaliy,
The problem was in cached queries.