There is a metamorph index on Catno.
slow performance with queries
Did you set the index expression so single digits will be indexed? Are there any messages in the html while running the query?
John Turnbull
Thunderstone Software
slow performance with queries
Actually, the categories are of 2 characters, \alpha\digit so that they are not confused with noise words.
I did not change any of the other Webinator settings.
All the html pages have at least one category and in one version I use both '+' and '-' in the search. It doesn't seem to make a difference in the performance with '+' added.
The index was created with only the simple statement:
create metamorph index xhtmlcat on html(Catno)
When it doesn't time out, it just returns the query with no message.
slow performance with queries
I'm confused by that last statement. Are you saying that you did a query that should have had answers but didn't?
View the source of the results page to see if there are any error or warning messages within html comments.
slow performance with queries
I get a blank page.
The page source is also blank.
slow performance with queries
That's very odd. Even for a timeout you should get back a page that says timeout (unless you've set your script timeout to -1 or some huge number and it's the web client or server that's timing out). Check your vortex.log and webserver error log for corresponding events.
You can eliminate the webserver from the equation by performing SQL on the command line with texis.
texis -d /path/to/your/database -s "select ..."
Please also summarize the actual SQL statement and queries you're using and what kind(s) of indices are on the fields being queried.
slow performance with queries
It may have been a webserver timeout. It doesn't happen very often, and it is not my main concern, which is general slowness.
We run many crawls which feed the resulting databases to a main search database using a unique hash id to keep them distinct. The database now has 3+ million pages and growing. The searches seem to get slower as it grows.
If this doesn't resolve, we may have to split up the db and piece the results together with relevance ranking. Will that run faster?
We are using the Linux version with 2 GB of RAM.
For this query, without the Catno clause, the performance is fine, and using likep (not liker or like3) makes the speed tolerable.
Is there a reason why likep should not be used in the query instead of like for Catno?
The results are very different: the rank is completely off, but the pages don't seem to be less relevant.
select Url,Catno,count(*),$rank r from html where Title\Description\Keywords\Meta\Body likep 'breast cancer' and Catno likep 'a1 b5' group by Depth;
Title\Description\Keywords\Meta\Body is a metamorph inverted index
and
Catno is a regular metamorph index
slow performance with queries
Also another oddity:
SQL 1>select Url,Catno,count(*),$rank r from html where Title\Description\Keywords\Meta\Body likep 'breast cancer' and Catno likep '+b3 +j9';
Url Catno count(*) r
------------+------------+------------+------------+
http://www.accc-cancer.org/ a0,b3,d2,g1, 100 183
j9 is not in the Catno field.
slow performance with queries
If the first LIKEP sufficiently reduces the result set, the second LIKEP may not use the index and will only affect the rank value. It will not exclude non-matching records, just lower their rank.
John Turnbull
Thunderstone Software
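To actually exclude pages that lack a category, one option is to keep LIKEP on the text fields for ranking and use LIKE on Catno as a filter, since LIKE only returns rows that match the query. This is a sketch only, assuming the same html table, fields, and indexes as in the queries above. It has not been run against a database of this size, and it does not by itself address the speed of the Catno clause:

```sql
select Url, Catno, $rank r from html where Title\Description\Keywords\Meta\Body likep 'breast cancer' and Catno like '+b3 +j9';
```

With this form, a row such as the accc-cancer.org one above, whose Catno has b3 but not j9, should no longer come back.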