Urls list and -j

Thunderstone
Site Admin
Posts: 2504
Joined: Wed Jun 07, 2000 6:20 pm


Post by Thunderstone »



To the Thunderstone staff: Happy holidays. Now, on to the question.

I am indexing customer sites with Webinator and full Texis. For the most
part, these sites live in accounts on ISPs all over the place. I would
like to walk only each customer's own site, without following links to
their friends' accounts on the same ISP.

E.g. http://www.momandpop-isp.net/~joes/ contains:
Visit mah buddy Herb's page at http://www.momandpop-isp.net/~herb/

I want to walk a series of URLs that I specify in "Url.txt" while staying
within the directory specified in each URL.

Something like:

gw -dsites -j"&Url.txt" "&Url.txt"


Can I do this with Webinator? I suppose with Texis and Webscript I could do:

<sql "select Urls from ATable"></sql>
<loop $Urls>
<exec gw -j$Urls $Urls>
</exec>
</loop>

--- but it seems like gw would start, index, stop, start, index, stop...


Any suggestions?


Steve Ferda








Post by Thunderstone »



You can make an options file (Urlj.txt in this example) containing the j
options, one per line, with no leading or trailing spaces:
jhttp://www.momandpop-isp.net/~joes/
...
then pass that file with -m, something like:
gw -dsites -mUrlj.txt "&Url.txt"
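
If Url.txt already holds one start URL per line, the options file does not have to be typed by hand. The following is only a sketch under that assumption: make_jfile is a made-up helper name, not part of Webinator, and it simply trims whitespace, drops blank lines, and puts a "j" in front of each URL.

```shell
# make_jfile is a hypothetical helper, not a gw feature.
# $1: URL list, one start URL per line (assumed layout of Url.txt)
# $2: options file to write (Urlj.txt in the example above)
make_jfile() {
    # Trim leading and trailing whitespace, drop empty lines,
    # then prefix every remaining URL with "j".
    sed -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' \
        -e 's/^/j/' "$1" > "$2"
}
```

It would be used as `make_jfile Url.txt Urlj.txt`, followed by the gw command shown above.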

Your alternate approach is also OK if you add -noindex to each run, then
do one final run with -index at the end:
gw -noindex -jurl1 url1
gw -noindex -jurl2 url2
gw -index
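
From a plain shell, the same sequence could be driven off the URL list. This is a rough sketch, not a tested recipe: walk_each and the GW variable are invented names for illustration, the list is assumed to hold one URL per line, and any extra arguments (such as -dsites from the original question) are passed through to every gw call unchanged.

```shell
# walk_each is a hypothetical wrapper; only the gw options shown in
# this thread (-noindex, -j, -index) are used.
# $1: file with one start URL per line; remaining args go to every gw run.
# GW may be set to override the command name (defaults to gw).
walk_each() {
    list=$1
    shift
    while IFS= read -r url || [ -n "$url" ]; do
        [ -n "$url" ] || continue          # skip blank lines
        # One walk per URL, restricted to that URL, index build deferred.
        "${GW:-gw}" "$@" -noindex -j"$url" "$url" </dev/null
    done < "$list"
    # Single index build after all walks are done.
    "${GW:-gw}" "$@" -index
}
```

For example, `walk_each Url.txt -dsites` would run one -noindex walk per line of Url.txt and then a single -index pass.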


