Ignoring robots.txt file

sunnedaze
Posts: 22
Joined: Mon Jul 28, 2003 2:07 pm

Post by sunnedaze »

I guess I don't know how to do that test. I typed it at the RUN prompt (I am on NT), and the output flashed by too quickly to read.

texis "profile=allenet" top="cleohsenet01.napa.adpa.ad.etn.com" ttyverbose=1 dowalk/getrobots.txt
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

And an HTTP URL requires the leading "http://".
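
As a rough illustration of the point above, a script that builds the top= value could add the scheme when it is missing. This is only a minimal Python sketch; the helper name with_http_scheme is made up for illustration and is not part of Texis or Webinator.

```python
def with_http_scheme(url):
    """Return url unchanged if it already starts with http:// or https://,
    otherwise prepend http:// (hypothetical helper, not a Texis function)."""
    if url.lower().startswith(("http://", "https://")):
        return url
    return "http://" + url


# Host name taken from the command earlier in this thread.
top = with_http_scheme("cleohsenet01.napa.adpa.ad.etn.com")
print("top=" + top)
```

The resulting string can then be passed as the top= argument on the texis command line.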
sunnedaze
Posts: 22
Joined: Mon Jul 28, 2003 2:07 pm

Post by sunnedaze »

texis "profile=allenet" top="cleohsenet01.napa.adpa.ad.etn.com" ttyverbose=1 dowalk/getrobots.txt
texis is not recognized as an internal or external command, operable program or batch file
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

sunnedaze
Posts: 22
Joined: Mon Jul 28, 2003 2:07 pm

Post by sunnedaze »

Sorry to be such a dummy. We have the dowalk script installed at F:/Thunderstone Software/Webinator/Texis/Scripts/Webinator.

We have the PROFILENAME installed at F:/Thunderstone Software/Webinator/Texis

Doing a search on the Document Root (URL of website), I'm not coming up with anything.

Tried running texis "profile=PROFILENAME" dowalk/dispatch.txt at both prompts and got the same message: texis is not recognized, etc.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

F:\Thunderstone Software\Webinator\Texis is not the profile name. The profile name is the name you gave your Webinator walk profile when you created it in the admin interface.

You should find texis.exe in your installation directory.

"F:\Thunderstone Software\Webinator\Texis\texis" profile=allenet top=http://cleohsenet01.napa.adpa.ad.etn.com "F:/Thunderstone Software/Webinator/Texis/Scripts/Webinator/dowalk/getrobots.txt"

Or, to shorten it, cd to the install directory first:

cd /d "F:\Thunderstone Software\Webinator\Texis"
texis profile=allenet top=http://cleohsenet01.napa.adpa.ad.etn.com Scripts/Webinator/dowalk/getrobots.txt
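
If the command is launched from a script rather than typed at the prompt, passing the arguments as a list sidesteps the quoting trouble caused by the space in "Thunderstone Software". What follows is a minimal Python sketch, assuming texis.exe sits directly in the install directory as described above. build_texis_command is a hypothetical helper, and the profile, host and script path are simply the ones used in this thread.

```python
import subprocess
from pathlib import PureWindowsPath


def build_texis_command(install_dir, profile, top, script):
    """Build the texis argument list (hypothetical helper for illustration).

    Each list element is one argument, so spaces inside a path are harmless.
    """
    exe = str(PureWindowsPath(install_dir) / "texis.exe")
    return [exe, f"profile={profile}", f"top={top}", script]


cmd = build_texis_command(
    r"F:\Thunderstone Software\Webinator\Texis",
    "allenet",
    "http://cleohsenet01.napa.adpa.ad.etn.com",
    "F:/Thunderstone Software/Webinator/Texis/Scripts/Webinator/dowalk/getrobots.txt",
)

# Show the equivalent, correctly quoted Windows command line.
print(subprocess.list2cmdline(cmd))

# On the Windows machine itself the command could then be run with:
# subprocess.run(cmd, check=False)
```

Printing the command line first makes it easy to see which arguments end up quoted before anything is actually executed.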
sunnedaze
Posts: 22
Joined: Mon Jul 28, 2003 2:07 pm

Post by sunnedaze »

I got a listing of texis options and was then returned to the prompt. Viewing the walk log via the browser shows nothing was updated.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Never mind, it appears there was a potential problem parsing robots.txt on Windows. Download the latest dowalk and webinatoradmin scripts from the Webinator examples page for a fix.
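
The thread does not say what exactly went wrong in the parser, so the following is only an independent cross-check, not a description of how dowalk works. It is a minimal Python sketch that uses the standard urllib.robotparser module to list which URLs a given robots.txt would block. The robots.txt content and the /private/ path are invented for illustration, and disallowed is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser


def disallowed(robots_text, urls, agent="*"):
    """Return the URLs that robots_text forbids for the given user agent."""
    parser = RobotFileParser()
    # splitlines() accepts both Unix (\n) and Windows (\r\n) line endings.
    parser.parse(robots_text.splitlines())
    return [u for u in urls if not parser.can_fetch(agent, u)]


# Invented example content; substitute the site's real robots.txt text.
sample = "User-agent: *\r\nDisallow: /private/\r\n"
base = "http://cleohsenet01.napa.adpa.ad.etn.com"

blocked = disallowed(sample, [base + "/index.html", base + "/private/a.html"])
print(blocked)
```

If URLs that this check reports as blocked still show up in the walk, that suggests the walker is not honoring the file.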
sunnedaze
Posts: 22
Joined: Mon Jul 28, 2003 2:07 pm

Post by sunnedaze »

Installed the 2 scripts. Ran a new walk. Still no robots.txt file in the log. (I did not reboot after the install.)
Creating database F:\Thunderstone Software\Webinator/texis/allenet/db2...Done.
Walk started at 2003-08-01 11:09:24 (by user)
JavaScript walking not enabled by current license
HTTPS walking disabled
Start fetching at http://cleohsenet01.napa.ad.etn.com/
Reading urls from file F:\iPlanet\Servers\docs\enet\html\eneturl.txt
Ignore urls containing any of the following:
/cgi-bin/
~
?

started 1 (3744) on http://cleohsenet01.napa.ad.etn.com/
6118 pages (135,234,774 bytes) so far.
223 errors so far.
24 duplicate pages so far.
sunnedaze
Posts: 22
Joined: Mon Jul 28, 2003 2:07 pm

Post by sunnedaze »

Could it be that the files are named wrong? The new script files are .txt. The original ones show as plain "File" under Properties.