Thunderstone Support Forums

Posted: **Fri Oct 06, 2000 5:56 pm**

Hello,

I have a couple of questions about Webinator in general:

Can the search engine exclude certain words (such as "the" or "but")?
Is there any tracking mechanism for searches that returned with no results?
We are interested in tracking what people are looking for that we don't have.

I'm also encountering a problem that I hope you can solve. I type in this
command:

gw -dtest -mglossary.set http://innovate.tigffx.amsinc.com

and get this error message:

Incorrect usage: -
Use: gw [options] URL
or : gw -s[X] [options] SQL
-h : help.
url : Where to begin fetching HTML from.

See http://www.thunderstone.com/webinator/ for full documentation.

Can you tell me why this isn't working? "test" is the name of my database and
"glossary.set" is the name of my options file. Should this options file be in
the /webinator/bin/ directory? That's the directory I run the gw command from.

Thank you for your time.

Michael W. Smith

Posted: **Fri Oct 06, 2000 6:13 pm**

It does by default. They are called noise words.

You can log queries that have no answers to the querylog table.
See http://www.thunderstone.com/gw25man/node15.html
And http://www.thunderstone.com/texis/webin ... ==querylog

You can then perform SQL queries against that table to find out what
people are missing.

Looks like you've got the syntax in the option file wrong. Please read
http://www.thunderstone.com/gw25man/node20.html carefully. Do not use
the leading - on options in option files.
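
To make the "no leading -" rule easier to check, here is a minimal Python sketch that flags any line in an option file that begins with a dash. The function name find_dashed_options and the sample lines are hypothetical and only for illustration; they are not real gw option syntax, so consult the manual page above for the actual option names.

```python
def find_dashed_options(lines):
    """Return (line_number, text) pairs for lines that start with '-'.

    Assumption: the option file is plain text with one option per line.
    """
    flagged = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if stripped.startswith("-"):
            flagged.append((number, stripped))
    return flagged


if __name__ == "__main__":
    # Hypothetical placeholder lines, not real gw options.
    sample = ["-someoption value", "otheroption value"]
    for number, text in find_dashed_options(sample):
        print(f"line {number}: remove the leading '-' from: {text}")
```

To check a real file, pass its lines in, for example find_dashed_options(open("glossary.set").read().splitlines()).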

Posted: **Mon Oct 09, 2000 2:29 pm**

Use <apicp noise $customnoise> to set the desired noise list to the
list of terms in a variable called $customnoise.

See http://www.thunderstone.com/vortexman/node94.html
And http://www.thunderstone.com/texisman/node205.html

Posted: **Fri Oct 27, 2000 2:00 pm**

Hi Mark,

I recently wrote a java script that makes a copy of all the dynamic pages we
have on our server and copies them into a new directory under my main documents
directory. I was hoping to use webinator to search through these new files that
will be created via a CRON job every night at midnight and then walked by
webinator at 1am every night. However, when I wiped my webinator database and
searched for the files under this new directory, it did not find them. I
thought it might be a web server cache problem, so I bounced the server. This
did not solve the problem...so I waited overnight to try to walk the pages.
Webinator is still not finding the new .html pages. I thought rights to the
files might be an issue, so I gave read and execute rights to webinator. This
still didn't work.

My document directory is docs/
I created the new directory here: docs/Snaphots/

I have almost 1,000 other html files under the docs directories in various
directories that webinator searches through with no problem. I'm at a loss.
Any ideas?

Michael W. Smith

Posted: **Fri Oct 27, 2000 2:07 pm**

Not sure why you generated static pages instead of letting webinator
walk the dynamic ones, but...

Webinator will only walk pages that are linked into the site, i.e. pages
that can be reached by following hrefs starting at the URL(s) you provide
on the command line. You need to link those pages in, or give a base URL
that contains a list (or directory index) of them.
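
One way to provide such a list is to generate an index page after the nightly snapshot job finishes. The following is a minimal Python sketch, not anything Webinator provides: the function name build_index and the index.html file name are assumptions, and it only covers .html files directly inside the given directory.

```python
import html
from pathlib import Path
from urllib.parse import quote


def build_index(directory, index_name="index.html"):
    """Write an index page linking to every .html file in `directory`.

    Returns the sorted list of file names that were linked.
    """
    root = Path(directory)
    # Skip the index itself so it does not link to itself.
    pages = sorted(p.name for p in root.glob("*.html") if p.name != index_name)
    links = "\n".join(
        f'<li><a href="{quote(name)}">{html.escape(name)}</a></li>'
        for name in pages
    )
    body = f"<html><body><ul>\n{links}\n</ul></body></html>\n"
    (root / index_name).write_text(body, encoding="utf-8")
    return pages


if __name__ == "__main__":
    # Directory name taken from the post above; adjust to the real path.
    print(build_index("docs/Snaphots"))
```

Run it from the same cron schedule, after the pages are copied and before the walk, then either link the generated index from an existing page or pass its URL to gw as a starting URL.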

Webinator Questions
