
Updating a database

Posted: Mon Apr 07, 1997 3:59 pm
by Thunderstone



If you want to refetch pages modified since the last walk, you can
use the -e option with the appropriate date.

If you want to add a few new pages, specify them on the command line, as in
your example above.

If you want to rewalk your entire site (the most common usage), walk to a
new database, index it, copy any top.html and bottom.html from the old to
the new, then switch old for new. Something like:

gw -w0 -d/httpd/webinator/newdb http://www.mysite.com/
gw -index -d/httpd/webinator/newdb
cp /httpd/webinator/db/top.html /httpd/webinator/newdb
cp /httpd/webinator/db/bottom.html /httpd/webinator/newdb
mv /httpd/webinator/db /httpd/webinator/olddb
mv /httpd/webinator/newdb /httpd/webinator/db
rm -rf /httpd/webinator/olddb
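
The steps above could be wrapped in a small script so that a failed walk never replaces the live database. This is only a sketch: rewalk_and_swap is a made-up helper name, and it assumes gw exits with a non-zero status on failure, which nothing in this thread confirms. The gw options are only the ones shown above.

```shell
#!/bin/sh
# Sketch: rewalk into a fresh database, then swap it into place.
# rewalk_and_swap is a hypothetical helper, not part of Webinator.
# Assumption: gw returns non-zero when a walk or index fails.
rewalk_and_swap() {
    base=$1    # e.g. /httpd/webinator
    url=$2     # e.g. http://www.mysite.com/

    # Walk and index the new database; stop if either step fails so
    # the live database is never replaced by a partial one.
    gw -w0 -d"$base/newdb" "$url" || return 1
    gw -index -d"$base/newdb"     || return 1

    # Carry over the page header and footer if the old database has them.
    for f in top.html bottom.html; do
        if [ -f "$base/db/$f" ]; then
            cp "$base/db/$f" "$base/newdb/" || return 1
        fi
    done

    # Switch old for new, then discard the old copy.
    mv "$base/db" "$base/olddb" || return 1
    mv "$base/newdb" "$base/db" || return 1
    rm -rf "$base/olddb"
}

# Usage (paths and URL as in the commands above):
# rewalk_and_swap /httpd/webinator http://www.mysite.com/
```

Searches keep hitting the old database until the two mv commands run, so the window in which results are missing is short.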

Updating a database

Posted: Mon Apr 07, 1997 6:39 pm
by Thunderstone


On Mon, 7 Apr 1997, Mark Willson wrote:


Can I just do:

gw -d/httpd/webinator/olddb http://www.mysite.com/

And just have it update while being used? Or do I really have to go
through the tedious process below?


Thanks,

-------------------------------------------------------------------------
| Brock Rozen | brozen@webdreams.com | http://www.webdreams.com/~brozen |
-------------------------------------------------------------------------



Updating a database

Posted: Mon Apr 07, 1997 6:42 pm
by Thunderstone


I keep my site index at the Webinator site. How often is it rewalked?

Cheerio, <mailto:ds@asu.edu>, <http://www.public.asu.edu/~dseeburg/>
Dierk Seeburg, Dept. of Botany, Arizona State University, Tempe, USA



Updating a database

Posted: Mon Apr 07, 1997 8:26 pm
by Thunderstone




About once a month. If the index is not accessed by anyone at all for
over a month, it is assumed inactive and deleted. The originator of the
index is mailed in either case.


-Kai


Kai Getrost | Thunderstone Software - EPI, Inc.
| info@thunderstone.com



Updating a database

Posted: Tue Apr 08, 1997 10:12 am
by Thunderstone



You could just do:
gw -d/httpd/webinator/olddb -wipe
gw -d/httpd/webinator/olddb -w0 http://www.mysite.com/
gw -d/httpd/webinator/olddb -index
Just keep in mind that searches done while the walk is in progress
will be incomplete because you don't have all of your pages yet.
If you don't have a huge site, the rewalk should happen rather quickly.

Updating a database

Posted: Tue Apr 08, 1997 12:29 pm
by Thunderstone


On Tue, 8 Apr 1997, Mark Willson wrote:


Here's another idea:

Why can't I just do
gw -d/httpd/webinator/db -e"1997-04-08" ?

where the date would be the previous day. That way, it rewalks the entire
site, updates the current database (thus allowing anybody using it to
benefit from this new walk), and doesn't require me to create a new
database or delete the existing one.

Am I missing something? Many thanks for all your help,

-------------------------------------------------------------------------
| Brock Rozen | brozen@webdreams.com | http://www.webdreams.com/~brozen |
-------------------------------------------------------------------------



Updating a database

Posted: Tue Apr 08, 1997 1:26 pm
by Thunderstone



That was the first suggestion in my first reply.
Don't forget to re-index after walking:
gw -d/httpd/webinator/db -index
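
The two steps can be kept together so the re-index is never forgotten. A minimal sketch follows: update_since is a made-up helper name, and the YYYY-MM-DD date form is taken from the question above rather than from any confirmed gw documentation.

```shell
#!/bin/sh
# Sketch: refetch pages modified since a given date, then re-index.
# update_since is a hypothetical helper, not part of Webinator.
# Assumption: -e accepts the YYYY-MM-DD form used in the question above.
update_since() {
    db=$1       # database directory, e.g. /httpd/webinator/db
    since=$2    # cutoff date, e.g. 1997-04-08

    # Refetch pages modified since the cutoff; skip the index on failure.
    gw -d"$db" -e"$since" || return 1

    # Re-index after walking.
    gw -d"$db" -index
}

# Usage:
# update_since /httpd/webinator/db 1997-04-08
# With GNU date, yesterday's date can be produced by:
#   date -d yesterday +%Y-%m-%d
```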

Updating a database

Posted: Tue Jan 16, 2001 11:39 am
by slugwater
If I were to use this syntax to wipe my database and start over:

gw -d/httpd/webinator/olddb -wipe

gw -d/httpd/webinator/olddb -w0 http://www.mysite.com/

gw -d/httpd/webinator/olddb -index

where would I add the -j and -x options to retrieve only the pages I want?

Thanks!!

Updating a database

Posted: Tue Jan 16, 2001 11:58 am
by mark
Those are walking options so they should go with the walking command:

gw -d/httpd/webinator/olddb -w0 -j... -x... http://www.mysite.com/

Updating a database

Posted: Tue Jan 16, 2001 12:28 pm
by slugwater
How can I walk just one directory? I am trying to walk just a product catalog.

http://www.mysite.com/catalog

How would I ignore all other directories?

Thanks!!