Extra download required for Webinator with Texis?

mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Oops, sorry. Apparently IE's "Save As" adds a carriage return to the end of every line of the script, and the internal config is not expected to have that. Make the following change to webinatoradmin to accommodate it.
Search for
<timport
You'll find a long line ending with
*\x01\x0a=
Change that part to
*\x01=\x0d?\x0a=
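
For anyone wondering why the extra \x0d? helps, here is a rough sketch in Python. It uses Python's re module, not Texis REX, so the syntax is not the same as the webinatoradmin line. The sample text, the use of \x01 as a name/value separator, and the names STRICT, TOLERANT and parse are all made up for illustration. The only point is that an optional carriage return before the line feed lets the same pattern accept both LF and CRLF line endings.

```python
import re

# Hypothetical config text: "\x01" is only a stand-in separator here,
# not a statement about the real webinatoradmin format.
LF_TEXT = "name\x01value one\nother\x01value two\n"
CRLF_TEXT = LF_TEXT.replace("\n", "\r\n")  # what a CRLF-saving editor produces

# Strict: the value must be followed directly by a line feed.
STRICT = re.compile(r"([^\x01\r\n]*)\x01([^\r\n]*)\n")

# Tolerant: allow one optional carriage return before the line feed.
TOLERANT = re.compile(r"([^\x01\r\n]*)\x01([^\r\n]*)\r?\n")


def parse(pattern, text):
    """Return a list of (name, value) pairs found by the pattern."""
    return pattern.findall(text)
```

With the strict pattern, the CRLF text yields no entries at all, while the tolerant pattern returns the same pairs for both inputs.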
b.sims
Posts: 99
Joined: Fri Oct 26, 2001 10:40 am

Post by b.sims »

Bingo! It looks pretty normal now.

One last thing: Is it possible to feed in a list of URLs from a text file, as it was in 2.5? This is what I need to do before I go home for the weekend.
b.sims
Posts: 99
Joined: Fri Oct 26, 2001 10:40 am

Post by b.sims »

Also, can I specify that the crawl stop at a certain time (6am Monday morning)? I need it to stop before we open Monday but don't want to get out of bed at 4am.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

You can feed URLs from a local file (like gw's "&filelist") with the "URL File" or "Page File" option, depending on whether you want full site walks or just single-page fetches, respectively.

Webinator 4 will walk until completion or manual intervention.
Assuming you want an incomplete walk to go live, you could modify the "fetchset" function in dowalk. Insert this right before the userstats call:

<if convert( 'now' , 'date' ) gt convert( '2002-01-14 06:00:00' , 'date' )>
Walker stopping by time. ($top)
<$stopwalk=2>
<bye>
</if>
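
The logic of that check can be sketched outside Vortex as well. The Python below is only an illustration of the idea (compare the current time against a fixed cutoff before each fetch and stop once it has passed). It is not Webinator code, and fetch_all, DEADLINE and the injectable now argument are hypothetical names chosen for the example.

```python
from datetime import datetime

# Cutoff taken from the example above: Monday 2002-01-14, 06:00.
DEADLINE = datetime(2002, 1, 14, 6, 0, 0)


def fetch_all(urls, fetch, now=datetime.now):
    """Fetch URLs in order, stopping as soon as the deadline has passed.

    `fetch` is any callable taking a URL; `now` is injectable so the
    cutoff can be exercised without waiting for the real clock.
    """
    done = []
    for url in urls:
        if now() > DEADLINE:
            print("Walker stopping by time.")
            break
        done.append(fetch(url))
    return done
```

The check runs before each fetch, so a fetch already in progress is allowed to finish and the worker exits at the next opportunity.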
b.sims
Posts: 99
Joined: Fri Oct 26, 2001 10:40 am

Post by b.sims »

When I make this modification, webinator throws up the error 'Missing start single quote in value'. I used the code exactly as above, where it looks as though all the quotes are correctly paired up; perhaps I am missing something in the syntax?

I ran the crawler over the weekend and manually stopped it this morning; once this function is in place, will that index automatically be made once dowalk is run again?

Thanks a lot,
bart
Posts: 251
Joined: Wed Apr 26, 2000 12:42 am

Post by bart »

Make sure there are spaces before and after each single quote in a convert statement. This is silly, but the parser is kind of brain dead in this area.

I'm not sure about your second question, but you can check it yourself.
b.sims
Posts: 99
Joined: Fri Oct 26, 2001 10:40 am

Post by b.sims »

Also, does Webinator 4 contain an equivalent of the 2.5 todo table? I would like to be able to customize the crawler so that I can stop and restart a walk, in order to use system slowtime and bandwidth.
b.sims
Posts: 99
Joined: Fri Oct 26, 2001 10:40 am

Post by b.sims »

OK, I've been going through the code, please tell me if I am right about all this:

The system regularly checks the value of $stopwalk, which can be 0, 1 or 2. 1 means the walk was stopped by the system; 2 means it was stopped manually by the admin. As far as I can tell, stopping manually like this causes an abandon and the page index is not created: is this correct?

Can the indexing process be triggered manually?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Webinator 4 operates under a somewhat different paradigm than Webinator 2. It's not currently very amenable to stopping and restarting, but that's a feature we want to add. You might find the comments at the top of the dowalk script interesting.

Right. Stopping the walk in the dispatcher abandons it and does not make the index. The time-based stop I suggested goes in a different place: it stops the children spawned by the dispatcher. The dispatcher is oblivious to why they quit and will simply assume all is well, index, and make the database live.

You could call the "remakeindex" function documented at http://www.thunderstone.com/texis/site/ ... ing+dowalk
b.sims
Posts: 99
Joined: Fri Oct 26, 2001 10:40 am

Post by b.sims »

Based on what I said in 18, should your code in 14 be corrected to <$stopwalk=1>? My walks are not going live as expected; is this due to 2 being the abandon code?