Updating a database

Thunderstone
Site Admin
Posts: 2504
Joined: Wed Jun 07, 2000 6:20 pm

Post by Thunderstone »




If you want to refetch pages modified since the last walk, you can
use the -e option with the appropriate date.
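
A minimal sketch of how the date for -e might be tracked between walks. It only prints the command instead of running it. The stamp file lastwalk.date, the DB and STAMP variables, and the helper names are assumptions for illustration; the -e"YYYY-MM-DD" form is the one quoted later in this thread.

```shell
#!/bin/sh
# Sketch: build the incremental-refresh command from a recorded walk date.
# DB and STAMP are hypothetical defaults; adjust them to your installation.
DB=${DB:-/httpd/webinator/db}
STAMP=${STAMP:-./lastwalk.date}

# Print the gw command that would refetch pages modified since the given date.
refetch_cmd() {
    printf 'gw -d%s -e"%s"\n' "$DB" "$1"
}

# Read the date of the previous walk; fall back to today if none is recorded.
last_walk_date() {
    if [ -r "$STAMP" ]; then
        cat "$STAMP"
    else
        date +%Y-%m-%d
    fi
}

refetch_cmd "$(last_walk_date)"
# After a successful walk, record today's date for the next run:
#   date +%Y-%m-%d > "$STAMP"
```

Printing the command keeps the sketch safe to try; once the output looks right, it can be run by hand, followed by the re-index step mentioned further down the thread.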

If you want to add a few new pages, specify them on the command line, as in
your example above.

If you want to rewalk your entire site (the most common usage), walk to a
new database, index it, copy any top.html and bottom.html from the old to
the new, then switch old for new. Something like:

gw -w0 -d/httpd/webinator/newdb http://www.mysite.com/
gw -index -d/httpd/webinator/newdb
cp /httpd/webinator/db/top.html /httpd/webinator/newdb
cp /httpd/webinator/db/bottom.html /httpd/webinator/newdb
mv /httpd/webinator/db /httpd/webinator/olddb
mv /httpd/webinator/newdb /httpd/webinator/db
rm -rf /httpd/webinator/olddb
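
The sequence above could be wrapped in a small script that stops at the first failing step, so a failed walk never replaces the live database. This is only a sketch: the RUN switch, the BASE and URL variables, and the step helper are assumptions for illustration, and by default the commands are printed, not executed.

```shell
#!/bin/sh
# Sketch of the walk, index, swap sequence that halts on the first failure.
# RUN is a hypothetical switch: commands are only printed unless RUN=1.
BASE=${BASE:-/httpd/webinator}
URL=${URL:-http://www.mysite.com/}

# Print the command, and execute it only when RUN=1.
step() {
    echo "$*"
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    fi
}

rewalk_and_swap() {
    step gw -w0 "-d$BASE/newdb" "$URL" || return 1
    step gw -index "-d$BASE/newdb" || return 1
    # Copy top.html and bottom.html only if the old database has them.
    for f in top.html bottom.html; do
        if [ -f "$BASE/db/$f" ]; then
            step cp "$BASE/db/$f" "$BASE/newdb" || return 1
        fi
    done
    step mv "$BASE/db" "$BASE/olddb" || return 1
    step mv "$BASE/newdb" "$BASE/db" || return 1
    step rm -rf "$BASE/olddb"
}

rewalk_and_swap
```

Run it once as-is to review the printed commands, then with RUN=1 in the environment to execute them.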

Post by Thunderstone »



On Mon, 7 Apr 1997, Mark Willson wrote:


Can I just do:

gw -d/httpd/webinator/olddb http://www.mysite.com/

And just have it update while being used? Or do I really have to go
through the tedious process below?


Thanks,

-------------------------------------------------------------------------
| Brock Rozen | brozen@webdreams.com | http://www.webdreams.com/~brozen |
-------------------------------------------------------------------------



Post by Thunderstone »





About once a month. If the index is not accessed by anyone at all for
over a month, it is assumed inactive and deleted. The originator of the
index is mailed in either case.


-Kai


Kai Getrost | Thunderstone Software - EPI, Inc.
| info@thunderstone.com



Post by Thunderstone »




You could just do:
gw -d/httpd/webinator/olddb -wipe
gw -d/httpd/webinator/olddb -w0 http://www.mysite.com/
gw -d/httpd/webinator/olddb -index
Just keep in mind that searches done while the walk is in progress
will be incomplete because you don't have all of your pages yet.
If you don't have a huge site, the rewalk should happen rather quickly.

Post by Thunderstone »



On Tue, 8 Apr 1997, Mark Willson wrote:


Here's another idea:

Why can't I just do
gw -d/httpd/webinator/db -e"1997-04-08" ?

where the date would be the previous day. That way, it rewalks the entire
site, updates the current database (thus allowing anybody using it to
benefit from this new walk), and doesn't require me to create a new
database or delete the old one.

Am I missing something? Many thanks for all your help,

-------------------------------------------------------------------------
| Brock Rozen | brozen@webdreams.com | http://www.webdreams.com/~brozen |
-------------------------------------------------------------------------



Post by Thunderstone »




That was the first suggestion in my first reply.
Don't forget to re-index after walking:
gw -d/httpd/webinator/db -index
slugwater
Posts: 33
Joined: Fri Jan 12, 2001 1:27 pm

Post by slugwater »

If I were to use this syntax to wipe my database and start over:

gw -d/httpd/webinator/olddb -wipe

gw -d/httpd/webinator/olddb -w0 http://www.mysite.com/

gw -d/httpd/webinator/olddb -index

where would I add the -j and -x options to retrieve only the pages I want?

Thanks!!
mark
Site Admin
Posts: 5514
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Those are walking options so they should go with the walking command:

gw -d/httpd/webinator/olddb -w0 -j... -x... http://www.mysite.com/
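
To make the placement concrete, here is a rough sketch that prints the wipe, walk, and index commands and passes any extra options to the walk step only. The DB and URL variables and the rewalk_cmds helper are assumptions for illustration; supply your own -j and -x values as arguments, since the sketch does not interpret them.

```shell
#!/bin/sh
# Sketch: wipe, walk, index in place; extra options go to the walk step only.
# DB and URL are hypothetical defaults taken from the commands in this thread.
DB=${DB:-/httpd/webinator/olddb}
URL=${URL:-http://www.mysite.com/}

# Print the three commands; arguments are appended to the walk command only.
rewalk_cmds() {
    echo "gw -d$DB -wipe"
    walk="gw -d$DB -w0"
    for opt in "$@"; do
        walk="$walk $opt"
    done
    echo "$walk $URL"
    echo "gw -d$DB -index"
}

rewalk_cmds "$@"
```

The wipe and index lines stay the same whatever walking options are passed in.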

Post by slugwater »

How can I walk just one directory? I am trying to walk just a product catalog.

http://www.mysite.com/catalog

How would I ignore all other directories?

Thanks!!