choosing search scope

Thunderstone
Site Admin
Posts: 2504
Joined: Wed Jun 07, 2000 6:20 pm

Post by Thunderstone »



Greetings,

If we would like to add the capability of searching either the entire site
or certain subdirectories of the site, which method is preferable:

1) Create separate databases and pass the appropriate db value in the search
form

Example: Create separate databases:
gw -dALL http://www.mysite.com
gw -dSUB -jhttp://www.mysite.com/subdir http://www.mysite.com

In search form, have something like:
<select name="db">
<option value="ALL">Entire Site</option>
<option value="SUB">Subdirectory Only</option>
</select>

2) Use only one database and a MATCHES clause in the SQL statement

Example: Create one database:
gw http://www.mysite.com

In search form, have something like:
<INPUT TYPE=hidden NAME=db VALUE="db">
<select name="path">
<option value="www.mysite.com%">Entire Site</option>
<option value="www.mysite.com/subdir%">Subdirectory Only</option>
</select>

In script, have something like:
<sql ... "select ... where Title\Meta\Body likep $q
and Url matches $path ...">

Are there performance issues that make one better than the other (assuming
that either method would even work)?

Thanks.


Post by Thunderstone »



Either method will work.
The first can often provide better performance for larger databases
and under load.
The second is a little easier to maintain and requires less disk space.

A couple of notes:

In example 1 you have:
gw -dSUB -jhttp://www.mysite.com/subdir http://www.mysite.com
which should be:
gw -dSUB -jhttp://www.mysite.com/subdir http://www.mysite.com/subdir/

In example 2 you have:
<option value="www.mysite.com%">Entire Site</option>
which should be (assuming that there's only one site in the database):
<option value="">Entire Site</option>
so that it doesn't have to do unneeded work when searching the whole site.
(<$null="">, as in the default script, makes the empty parameter "go away".)
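
Putting the second note together, the method 2 form might look roughly like
the sketch below. The select and hidden db field come from the example in the
question, with only the "Entire Site" value changed. The form action, the text
box and the submit button are placeholders added for illustration and are not
taken from the original script; the text box is named q only because the
script example refers to $q.

```html
<!-- Sketch only: the action URL is a hypothetical placeholder. -->
<form method="get" action="/texis/search">
  <!-- Query box; named q to match $q in the script example -->
  <input type="text" name="q">
  <input type="hidden" name="db" value="db">
  <select name="path">
    <!-- Empty value: no Url restriction is needed for a whole-site search -->
    <option value="">Entire Site</option>
    <option value="www.mysite.com/subdir%">Subdirectory Only</option>
  </select>
  <input type="submit" value="Search">
</form>
```

With <$null=""> set in the script, choosing "Entire Site" sends an empty path,
so the Url matches $path part of the query should drop out instead of being
compared against every row.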




