Question on files having the /cgi-bin/ path

Thunderstone
Site Admin
Posts: 2504
Joined: Wed Jun 07, 2000 6:20 pm

Post by Thunderstone »




To webinator@thunderstone.com


Dear Sirs,
We have indexed our Web site using Webinator
(see the home page of "www.ictp.trieste.it").

We created indices for searching HTML pages with
gw http://www.ictp.trieste.it
gw -index

However, since one of our pages contained
paths to a cgi-bin script, we decided to add those links
by first deleting the page in question

(1) gw -s "delete from html where Url='www.ictp.trieste.it/ictp/smr.html'"

and then re-indexing it

(2) gw -v -g -a -C http://www.ictp.trieste.it/ictp/smr.html

Up to here everything works, including the search (for example,
a search for "smr1035").
However, if we afterwards run

gw -index

again, results for the same string (for example "smr1035") are no longer
listed as they were before the index was updated (by gw -index).
We select the default options, "sentence" and "Exact".

Could you tell me if I am doing something wrong?
Does this mean that the next time we need to update the database we will
lose the cgi-bin entries, and I will need to repeat steps (1) and (2)
without doing "-index" afterwards?
Thank you for your advice.



--
E. Canessa, Ph.D ICTP-International Centre for Theoretical Physics
PO Box 586 - 34100 Trieste, Italy
Phone: (+39) 040 2240358 FAX: (+39) 040 224163
PGP Public Key at: www.ictp.trieste.it/~canessae
Fingerprint: C0 6E 11 A6 A9 F5 5A DB 85 D1 F6 0F B4 64 86 94
KeyID: 8CF3A569
--



Post by Thunderstone »




It's not related to cgi-bin. It's your index expression. The default
index expression matches runs of 2-99 alphabetic characters, so the digits
in a term like "smr1035" are not indexed. You need to use the -k option
to allow numerics, e.g.: -k"\alnum{2,99}"

You will need to run -unindex, then use the -k option with -index to
rebuild the indices with whatever new expression(s) you decide to use.
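
To illustrate why an alphabetics-only expression loses a term like "smr1035",
here is a rough Python sketch. It is not Webinator's indexer: it assumes
ordinary Python regular expressions as stand-ins for the two index
expressions, and the helper index_terms and the sample page text are
hypothetical names made up for this example.

```python
import re

# Assumption: Python regexes approximating the two index expressions.
DEFAULT_EXPR = r"[A-Za-z]{2,99}"     # "2-99 alphabetics" (the default)
ALNUM_EXPR = r"[A-Za-z0-9]{2,99}"    # rough analogue of \alnum{2,99}


def index_terms(text, expr):
    """Return the lowercase terms that an index built with `expr` would hold."""
    return {match.lower() for match in re.findall(expr, text)}


# Hypothetical page text containing the term from the question.
page = "Workshop smr1035 announcement"

default_terms = index_terms(page, DEFAULT_EXPR)
alnum_terms = index_terms(page, ALNUM_EXPR)

# With the alphabetic expression only "smr" is kept; the digits are dropped.
print("smr1035" in default_terms)  # False
# With the alphanumeric expression the whole term is kept.
print("smr1035" in alnum_terms)    # True
```

Under these assumptions, the first index holds only "smr", so a lookup for
"smr1035" has nothing to match, while the second index holds the full term.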

Post by Thunderstone »



On Wed, 23 Apr 1997, Mark Willson wrote:




Thank you very much for your prompt reply.
We ran the following commands (in this order)
gw -unindex
gw -k"\alnum{2,99}" -index
and now everything is OK for searches like "smr1035".
Some comments received around here on your search engine
are very positive. Please keep us informed of any upgrades
and visit our Web site at "www.ictp.trieste.it".
Yours sincerely,
Enrique

