slow likep query

Post Reply
gary.gabriel
Posts: 8
Joined: Thu Mar 14, 2002 1:42 pm

slow likep query

Post by gary.gabriel »

I have the following query:

set likepobeyintersects=1;
select story_id from story where headline\text likep 'mutual fund stock economy @0' order by arrival_time desc;

This works fine, and I get results back in under a second. However, when I add more conditions to the query, it slows down to the point where it takes 80 seconds, which is unusable for me:

set likepobeyintersects=1;
select story_id from story where (headline\text likep 'mutual fund stock economy @0') and (ip_id=1 OR ip_id=3 OR ip_id=9 OR ip_id=10 OR ip_id=10 OR ip_id=11 OR ip_id=12 OR ip_id=16 OR ip_id=22 OR ip_id=24 OR ip_id=2040 OR ip_id=27 OR ip_id=3001 OR ip_id=35 OR ip_id=10015 OR ip_id=10044 OR ip_id=10045 OR ip_id=10052 OR ip_id=10100 OR ip_id=10126 OR ip_id=10127 OR ip_id=10131 OR ip_id=10137 OR ip_id=10142 OR ip_id=10187 OR ip_id=10255 OR ip_id=10260 OR ip_id=10277 OR ip_id=10286 OR ip_id=10290 OR ip_id=10292 OR ip_id=10294 OR ip_id=10295 OR ip_id=10313 OR ip_id=10323 OR ip_id=10331 OR ip_id=10362 OR ip_id=10374 OR ip_id=10381 OR ip_id=10389 OR ip_id=10395 OR ip_id=10431 OR ip_id=10436 OR ip_id=10463 OR ip_id=10491 OR ip_id=10497 OR ip_id=10512 OR ip_id=10687 OR ip_id=10695 OR ip_id=10700 OR ip_id=10728 OR ip_id=11153 OR ip_id=11237 OR ip_id=11307 OR ip_id=11357 OR ip_id=11384 OR ip_id=11390 OR ip_id=11428 OR ip_id=11448 OR ip_id=11460 OR ip_id=11462 OR ip_id=11500 OR ip_id=11509) order by arrival_time desc;

There is an inverted metamorph index on the fields that might appear in the query:

create metamorph inverted index story_headline_text_idx on story(headline\text, ip_id, arrival_time, source, publication_date, categories, companies);

but that index doesn't seem to help. I tried speeding up the query by adding another condition - '$rank>400' - but that didn't help either, and the query is still too slow to be usable. According to 'tsql -V', the 'story_headline_text_idx' is the only index being used for either query. The table is large (2G), and has about 600,000 rows. The box is a Sun 420R with 4G or RAM.

Is there any way to speed up that query?

TIA,

-Gary
User avatar
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH
Contact:

slow likep query

Post by John »

How many items match the ip_id clause and the LIKEP clause? An IN might be a little more efficient.
John Turnbull
Thunderstone Software
gary.gabriel
Posts: 8
Joined: Thu Mar 14, 2002 1:42 pm

slow likep query

Post by gary.gabriel »

With just the LIKEP clause, about 45000 items match if likeprows was set to 0. After adding the ip_id clause, about 29000 items match. I tried doing an IN instead of OR'ing the ip_id's together, and it didn't seem to make any difference.

Thanks,

-Gary
gary.gabriel
Posts: 8
Joined: Thu Mar 14, 2002 1:42 pm

slow likep query

Post by gary.gabriel »

Does tsql do its own caching of index files? How about vhttpd? If they do their own caching, are there parameters to adjust such as how much memory is used for caching? Or do they just make the open/read system calls on the files and rely on the kernel's filesystem cache?

Thanks.

-Gary
User avatar
Kai
Site Admin
Posts: 1272
Joined: Tue Apr 25, 2000 1:27 pm

slow likep query

Post by Kai »

Index files may be cached from query to query in the same transaction (web hit) by a given SQL handle, however they are not cached across transactions by Texis; the operating system's cache is relied on.

Some cross-transaction caching can be set up in vhttpd with the TxPreOpen... settings (see the manual). Eg. a starting point would be setting TxPreOpenDb to the database, and TxPreOpenNum to 2x the expected typical simultaneous transaction load. However, on many platforms in our experience, this has little performance impact.
Post Reply