table join performance

Post Reply
wtian
Posts: 11
Joined: Thu Apr 05, 2001 3:23 pm

table join performance

Post by wtian »

While working on texis, I use tsql for query testing of quick database check.

I experienced big performance issue when I try to join tables.

create table SOURCE_SUBSCRIPTION (
ID counter,
SOURCE_GROUP counter,
SOURCE counter,
STATUS varchar(10),
MDFY_DTTM date,
MDFY_USER varchar(30),
NOTE varchar(1024)
);

create table SOURCE_GROUP (
ID counter,
NAME varchar(50),
STATUS varchar(10),
TYPE varchar(20),
MDFY_DTTM date,
MDFY_USER varchar(30),
NOTE varchar(1024)
);

create table SOURCE (
ID counter,
NAME varchar(50),
TYPE varchar(30),
FREQUENCY long,
NEXTVISIT date,
LASTVISIT date,
LANG varchar(30),
LOCATION varchar(30),
BASEURL varchar(80),
AUTH_INFO varchar(1024),
ROBOT varchar(20),
ACCEPT varchar(1024),
REJECT varchar(1024),
TIMEOUT int,
PAGELIMIT int,
TIMELIMIT int,
VERBOSITY int,
SMETHOD varchar(80),
STORY_TABLE varchar(80),
STATUS varchar(10),
MDFY_DTTM date,
MDFY_USER varchar(30),
NOTE varchar(1024)
);

I have about 4000 rows in SOURCE and SOURCE_SUBSCRIPTION and less than 600 rows in SOURCE_GROUP.
They are indexed on NAME and ID.
except ID in SOURCE_SUBSCRIPTION which I will never query on. (I indexed after accquired data).

The following query takes about 7 seconds to return 26 rows of result.

tsql -d testing "select SOURCE.NAME,SOURCE.TYPE,SOURCE.BASEURL
from SOURCE,SOURCE_GROUP,SOUR
CE_SUBSCRIPTION
where
SOURCE.ID=SOURCE_SUBSCRIPTION.SOURCE and SOURCE_GROUP.ID=SOURCE_SUBSCRIPTION.SOURC
E_GROUP and SOURCE_GROUP.NAME='HITECHUS2'"

However, when I rewrote to the following:

tsql -d testing "select NAME,TYPE,BASEURL
from SOURCE where ID in
(select SOURCE
from SOURCE_SUBSCRIPTION,SOURCE_GROUP
where NAME='HITECHUS2' and SOURCE_GROUP=SOURCE_GROUP.ID)"

I got the same result instantly. Can someone give me any idea why the big difference for such a small table join?

What if there is no alternative for me? Can I tinker somewhere to speedup the join performance?

Another question:
I understand that I need reindex for metamorph index to take effect. Do I need to reindex for regular indexes?

I also noticed that I can not do "select <table>.* from <table>" from tsql I am not sure I can do that kind from vortex.

Thanks for any input.
User avatar
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

table join performance

Post by mark »

The table you're doing the real query against (SOURCE_GROUP) should be listed first in the from and where clauses. All join key fields should be indexed. Relations that narrow the most should be listed first.

Also, if there's a 1:1 relationship between SOURCE and SOURCE_SUBSCRIPTION they should be combined into one table.

Metamorph indices should be kept up to date. They don't need to be totally rebuilt. Issuing the same create statement that was used to create the index will update it to include new/changed records. Also, they will be updated automatically once a day if there is more than 5 megs of change. You can adjust both the time and size of the automatic update by adjusting SYSMETAINDEX. See http://www.thunderstone.com/site/texisman/node302.html
This also assumes that Chkind has not been disabled in texis.cnf. See
http://www.thunderstone.com/site/texisman/node294.html

Non-metamorph indices do not need to be updated.

You can't do "select SOURCE.* ...". You need to enumerate the fields.
bart
Posts: 251
Joined: Wed Apr 26, 2000 12:42 am

table join performance

Post by bart »

You can also use "chkind" to monitor the index status automatically.

Queries against the Metamorph text index will always return the correct results without regard to index update status. The difference will be a slight increase in query time while Texis accounts for the new/updated/deleted records in realtime.
Post Reply