reload (-e) vs. rewalk?

Thunderstone
Site Admin
Posts: 2504
Joined: Wed Jun 07, 2000 6:20 pm

Post by Thunderstone »



The features of reload vs. rewalk are not clear, and the docs don't
really explain the issues.

My sense from the list is that I should use rewalk until I get all my
flags right, then reload. Is that correct?

When I have my index as I like it, can I just do a

gw -e -"-0 days" -V

to get all new pages using "IF-MODIFIED-SINCE"?


Thanks,

Avi
________________________________________________________________
The Complete Guide to Site Indexing and Local Search Engines
<mailto:nets@searchtools.com> <http://www.searchtools.com>


Thunderstone
Site Admin
Posts: 2504
Joined: Wed Jun 07, 2000 6:20 pm

Post by Thunderstone »



-rewalk creates a new, pristine temporary database, then walks all of
the URLs previously specified on the command line or in list files for
the selected database. It uses the options specified during the most
recent walk of that database. It does not accept new or different
options or URLs, except for -d and -v. Any other options or URLs will
be silently ignored. When walking is completed, the entire old database
is replaced by the new database. -rewalk behavior can be roughly
summarized as:
gw -dnew -create
gw -dnew PREVIOUS_OPTIONS ALL_PREVIOUSLY_SPECIFIED_URLS
rm -r old
mv new old

-e works strictly within the specified existing database. It looks for
any URLs that were last visited before the specified date and refetches
them using the option settings given on the command line. -V makes gw
use the HTTP "If-Modified-Since" header, so that web servers that
support it won't send the whole page if it hasn't been modified since
it was last fetched. There should not be a space between -e and the
time period, and the time period should never be less than the time it
takes to complete a full walk (e.g.: -e"-1 day").
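
As a rough sketch of the reload described above, the snippet below only
assembles and prints a gw command line, with the time period attached
directly to -e and -V added. The database path /somedb, the default age,
and the reload_cmd helper are assumptions made for illustration; only
-d, -e, and -V come from this thread.

```shell
#!/bin/sh
# Sketch only: build the reload command discussed in this thread.
# DB and RELOAD_AGE are hypothetical defaults; adjust for your setup.
DB="${DB:-/somedb}"
RELOAD_AGE="${RELOAD_AGE:--1 day}"

# reload_cmd DB AGE
# Prints a gw reload command line. The age is glued to -e with no
# space in between, and -V asks gw to send If-Modified-Since.
reload_cmd() {
    printf 'gw -d%s -e"%s" -V\n' "$1" "$2"
}

# Dry run: show the command instead of executing it.
reload_cmd "$DB" "$RELOAD_AGE"
```

Printing the command first makes it easy to check the quoting before it
goes into a scheduled job. Pick an age that is at least as long as a
full walk of the site takes.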

You should not use either of these while playing with your option
settings. You should walk manually, using -wipe between tries, until
you're happy with your settings. Once you have a good database, you
can decide whether to use one of the above procedures or to rewalk
manually each time.



justin
Posts: 7
Joined: Sat Jul 01, 2000 4:08 pm

Post by justin »

Does the -e option rebuild the search indices? Can this be prevented by adding the -noindex option like:
gw -d/somedb -noindex -e"-29 days"

Thanks
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Yes and yes: -e rebuilds the search indices, and adding -noindex prevents that.