Updating a database
Posted: Mon Apr 07, 1997 3:59 pm
by Thunderstone
If you want to refetch pages modified since the last walk, you can
use the -e option with the appropriate date.
If you want to add a few new pages, specify them on the command line similar
to your example above.
If you want to rewalk your entire site (the most common usage), walk to a
new database, index it, copy any top.html and bottom.html from the old to
the new, then switch old for new. Something like:
gw -w0 -d/httpd/webinator/newdb http://www.mysite.com/
gw -index -d/httpd/webinator/newdb
cp /httpd/webinator/db/top.html /httpd/webinator/newdb
cp /httpd/webinator/db/bottom.html /httpd/webinator/newdb
mv /httpd/webinator/db /httpd/webinator/olddb
mv /httpd/webinator/newdb /httpd/webinator/db
rm -rf /httpd/webinator/olddb
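The same sequence can be wrapped so that the live database is only swapped out when the walk and the index both succeed. This is a minimal sketch, not an official script: the function name rewalk_site and its two arguments are made up for illustration, it assumes gw returns a non-zero exit status on failure, and it assumes no leftover olddb directory exists. Only the gw options shown above are used.

```shell
#!/bin/sh
# Sketch: rebuild the index in a scratch database, then swap it in.
# rewalk_site is a hypothetical helper; the db/newdb/olddb names follow
# the example above.

rewalk_site() {
    base=$1    # e.g. /httpd/webinator
    url=$2     # e.g. http://www.mysite.com/

    # Walk and index into the new database. Stop if either step fails
    # (assumes gw signals failure through its exit status) so the live
    # database is left untouched.
    gw -w0 -d"$base/newdb" "$url" || return 1
    gw -index -d"$base/newdb"     || return 1

    # Carry over the custom header and footer if they exist.
    for f in top.html bottom.html; do
        if [ -f "$base/db/$f" ]; then
            cp "$base/db/$f" "$base/newdb/" || return 1
        fi
    done

    # Switch old for new, then discard the old copy.
    mv "$base/db" "$base/olddb" || return 1
    mv "$base/newdb" "$base/db" || return 1
    rm -rf "$base/olddb"
}
```

It would be called as: rewalk_site /httpd/webinator http://www.mysite.com/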
Updating a database
Posted: Mon Apr 07, 1997 6:39 pm
by Thunderstone
On Mon, 7 Apr 1997, Mark Willson wrote:
Can I just do:
gw -d/httpd/webinator/olddb http://www.mysite.com/
and just have it update while being used? Or do I really have to go
through the tedious process below?
Thanks,
-------------------------------------------------------------------------
| Brock Rozen | brozen@webdreams.com | http://www.webdreams.com/~brozen |
-------------------------------------------------------------------------
Updating a database
Posted: Mon Apr 07, 1997 6:42 pm
by Thunderstone
I keep my site index at the webinator site, how often is it rewalked?
Cheerio, <mailto:ds@asu.edu>, <http://www.public.asu.edu/~dseeburg/>
Dierk Seeburg, Dept. of Botany, Arizona State University, Tempe, USA
Updating a database
Posted: Mon Apr 07, 1997 8:26 pm
by Thunderstone
About once a month. If the index is not accessed by anyone at all for
over a month, it is assumed inactive and deleted. The originator of the
index is mailed in either case.
-Kai
Kai Getrost | Thunderstone Software - EPI, Inc. | info@thunderstone.com
Updating a database
Posted: Tue Apr 08, 1997 10:12 am
by Thunderstone
You could just do:
gw -d/httpd/webinator/olddb -wipe
gw -d/httpd/webinator/olddb -w0 http://www.mysite.com/
gw -d/httpd/webinator/olddb -index
Just keep in mind that searches done while the walk is in progress
will be incomplete because you don't have all of your pages yet.
If you don't have a huge site, the rewalk should happen rather quickly.
Updating a database
Posted: Tue Apr 08, 1997 12:29 pm
by Thunderstone
On Tue, 8 Apr 1997, Mark Willson wrote:
Here's another idea:
Why can't I just do
gw -d/httpd/webinator/db -e"1997-04-08" ?
where the date would be the previous day. That way, it rewalks the entire
site, updates the current database (thus allowing anybody using it to
benefit from this new walk), and doesn't require me to create a new
database or delete the old one.
Am I missing something? Many thanks for all your help,
-------------------------------------------------------------------------
| Brock Rozen | brozen@webdreams.com | http://www.webdreams.com/~brozen |
-------------------------------------------------------------------------
Updating a database
Posted: Tue Apr 08, 1997 1:26 pm
by Thunderstone
That was the first suggestion in my first reply.
Don't forget to re-index after walking:
gw -d/httpd/webinator/db -index
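Since the re-index step is easy to forget, the two commands can be chained so that the index is only rebuilt after the refetch finishes. A rough sketch: the function name refresh_since and its arguments are invented for illustration, the date format is simply the one used in the question above, and it assumes gw returns a non-zero exit status on failure.

```shell
#!/bin/sh
# Sketch: refetch pages modified since a given date, then re-index.
# refresh_since is a hypothetical helper, not part of Webinator.

refresh_since() {
    db=$1      # database directory, e.g. /httpd/webinator/db
    since=$2   # date in the form used in the post above, e.g. 1997-04-08

    # Refetch pages modified since the given date; skip the re-index
    # if this step fails (assumes gw reports failure via exit status).
    gw -d"$db" -e"$since" || return 1

    # Don't forget to re-index after walking.
    gw -d"$db" -index
}
```

It would be called as: refresh_since /httpd/webinator/db 1997-04-08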
Updating a database
Posted: Tue Jan 16, 2001 11:39 am
by slugwater
If I were to use this syntax to wipe my database and start over:
gw -d/httpd/webinator/olddb -wipe
gw -d/httpd/webinator/olddb -w0 http://www.mysite.com/
gw -d/httpd/webinator/olddb -index
where would I add the -j and -x options to retrieve only the pages I want?
Thanks!!
Updating a database
Posted: Tue Jan 16, 2001 11:58 am
by mark
Those are walking options so they should go with the walking command:
gw -d/httpd/webinator/olddb -w0 -j... -x... http://www.mysite.com/
Updating a database
Posted: Tue Jan 16, 2001 12:28 pm
by slugwater
How can I walk just one directory? I am trying to walk just a product catalog.
http://www.mysite.com/catalog
How would I ignore all other directories?
Thanks!!