searching slow on one column

Post Reply
MiniMe
Posts: 210
Joined: Thu Mar 15, 2001 4:30 pm

searching slow on one column

Post by MiniMe »

We have a column that is the price of a product and is set to data type float. We have an index on it like this:
tsql "create index xretprice on books(RETAIL_PRICE)"

but when do a query for a price between two values the search is incredibly slow. Any ideas on how we could optimize this?

our query is like this:
and RETAIL_PRICE between $$qprice and $$qpricehi
User avatar
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH
Contact:

searching slow on one column

Post by John »

That depends on what comes before the AND. If it is a LIKE query then you should create a compound metamorph index including RETAIL_PRICE as an additional field in the index.

You might try lowering btreethreshold, e.g.

set btreethreshold = 10;

before the query, which will cause texis to only use the index if a smaller, more manageable set of records match. The default is 50, meaning if 50% or less of the records match the price range, then it will use the index. The assumption is that it will be faster to look through the extra records to see if they match than to process the index.
John Turnbull
Thunderstone Software
MiniMe
Posts: 210
Joined: Thu Mar 15, 2001 4:30 pm

searching slow on one column

Post by MiniMe »

Here is the entire query:

<$sqlq="select " $return " from " $table " where
AUTHOR like $$qauth and TITLE like $$qtit
and AUTHOR\TITLE like $$qat
and AUTHOR = $$qcauth and TITLE = $$qctit
and BASIC\LC_SUBJECT\GEO_CODE\KEYWORDS like $$qtopic
and BASIC\LC_SUBJECT\GEO_CODE\KEYWORDS liker $$qtopicr
and AUTHOR\TITLE\PUBLISHER\MAIN_TOPIC\KEYWORDS\EDITION\COMMENTS\NOTES\PLACE_PUB\DATE_PUB like $$query
and AUTHOR\TITLE\PUBLISHER\MAIN_TOPIC\KEYWORDS\EDITION\COMMENTS\NOTES\PLACE_PUB\DATE_PUB like $$newquery
and BOOKNUM = $$bookbin
and id = $$bookid
" $inpart "
and USER_ID = $$quserid
and PUBLISHER like $$qpub
and FL_COVER = $$binding
and FL_SIGNED = $$signed
and FL_EDITION = $$first
and FL_DUSTJACKET = $$dj
and CATLG_ISBN = $$qisbn
and DATE_PUB between $$qyear and $$qyearhi
and DATE_PUB = $$qyearonly
and RETAIL_PRICE between $$qprice and $$qpricehi ">

It only slows when we do the retail_price search portion. It looks like from our script that btreethreshold is set to 5 in that function.. does that carry through the entire execution or need to be explicit before the SQL call??
User avatar
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH
Contact:

searching slow on one column

Post by John »

The setting will carry through the remainder of the execution of the script. With btreethreshold set to 5 it will not use the index if the price range will match more than 5% of the records, and one of the other parts of the query used an index (I'm assuming that $null is causing some of the query to drop away). If you take the query that will be executed, you can use

tsql -V

which should show you which indexes it is using.
John Turnbull
Thunderstone Software
MiniMe
Posts: 210
Joined: Thu Mar 15, 2001 4:30 pm

searching slow on one column

Post by MiniMe »

Does this provide any insight as to a direction I should head.

bash-2.04# time tsql -V "set btreethreshold ='50';select count(*) from books where BASIC\LC_SUBJECT\GEO_CODE\KEYWORDS like 'Drawing' and RETAIL_PRICE <= '6'" |more
200 Setting pred BASIC\LC_SUBJECT\GEO_CODE\KEYWORDS LIKE 'Drawing' AND RETAIL_PRICE<= 6 on table books in the function settablepred
200 Opening index /db/indexes/prod/xtopic in the function ixfmmindex
200 Opening index /db/indexes/prod/xretprice in the function ixbtindex
200 Expect to read 5% of the index in the function ixbtindex
Texis Version 03.01.971289085(20001011) Copyright (c) 1988-2000 Thunderstone EPI

count(*)
------------+
591

real 0m5.803s
user 0m4.247s
sys 0m0.423s
bash-2.04# time tsql -V "set btreethreshold ='5';select count(*) from books where BASIC\LC_SUBJECT\GEO_CODE\KEYWORDS like 'Drawing' and RETAIL_PRICE <= '6'" |more
200 Setting pred BASIC\LC_SUBJECT\GEO_CODE\KEYWORDS LIKE 'Drawing' AND RETAIL_PRICE<= 6 on table books in the function settablepred
200 Opening index /db/indexes/prod/xtopic in the function ixfmmindex
200 Opening index /db/indexes/prod/xretprice in the function ixbtindex
200 Expect to read 5% of the index in the function ixbtindex
Texis Version 03.01.971289085(20001011) Copyright (c) 1988-2000 Thunderstone EPI

count(*)
------------+
591

real 0m7.302s
user 0m4.504s
sys 0m0.390s
User avatar
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH
Contact:

searching slow on one column

Post by John »

Both of those queries took about the same amount of CPU time, as they both used both indexes, so the difference in actual time was most likely due to the load on the machine.

I'm not sure how many records you have, but it looks like there are about 12,000 records that match "Drawing", and 5% with a price under 6, so it has to merge those two lists which is taking the time. A compound metamorph index is probably the best solution.
John Turnbull
Thunderstone Software
MiniMe
Posts: 210
Joined: Thu Mar 15, 2001 4:30 pm

searching slow on one column

Post by MiniMe »

There are about 20,000 records that match that Drawing search out of 16 Million records.

when you say compund metamorph index what exactly do you mean and how would I query on price ranges like less than 6 or between 10 and 20 with a metamorph index??

Thanks
User avatar
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH
Contact:

searching slow on one column

Post by John »

5% of 16M is 800K, which is a lot to process against. A metamorph compound index could be created as:

create metamorph inverted index xtopic(BASIC\LC_SUBJECT\GEO_CODE\KEYWORDS, RETAIL_PRICE)
John Turnbull
Thunderstone Software
Post Reply