Ignoring robots.txt file

Posted: Fri Aug 01, 2003 3:24 pm
by sunnedaze
Nothing like that. Just what I copied above

Posted: Fri Aug 01, 2003 3:48 pm
by mark
Does that page get skipped if you add
http://cleohsenet01.napa.ad.etn.com/wlc050403
to the "Exclusion Prefix" under all walk settings and do a new walk?

Posted: Fri Aug 01, 2003 4:16 pm
by sunnedaze
I put /wlc050403 in the exclusion prefix section, did a new walk & yes, the site came up when I did a search.

Posted: Fri Aug 01, 2003 4:19 pm
by mark
In the "Exclusion Prefix" box you have to enter the entire URL prefix, as I gave it. You can enter just
/wlc050403
in the plain "Exclusions" box though.
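
The difference between the two boxes can be sketched in a few lines of Python. This is only an illustration, not Webinator's actual code: it assumes "Exclusion Prefix" entries are compared against the start of the full URL and plain "Exclusions" entries are matched anywhere inside the URL, which is what the advice above implies. The function names and the index.html page are hypothetical.

```python
def excluded_by_prefix(url, prefixes):
    """Assumed "Exclusion Prefix" behaviour: the URL must start with an entry."""
    return any(url.startswith(p) for p in prefixes)


def excluded_by_substring(url, fragments):
    """Assumed plain "Exclusions" behaviour: an entry may appear anywhere in the URL."""
    return any(f in url for f in fragments)


# Hypothetical page under the directory discussed in this thread.
url = "http://cleohsenet01.napa.ad.etn.com/wlc050403/index.html"

# A bare path never matches as a prefix, because the URL starts with "http://".
print(excluded_by_prefix(url, ["/wlc050403"]))
# The full URL prefix does match.
print(excluded_by_prefix(url, ["http://cleohsenet01.napa.ad.etn.com/wlc050403"]))
# The bare path is enough for a substring-style exclusion.
print(excluded_by_substring(url, ["/wlc050403"]))
```

Under these assumptions, /wlc050403 only has an effect in the plain "Exclusions" box, while the "Exclusion Prefix" box needs the whole URL prefix.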

Posted: Fri Aug 01, 2003 4:24 pm
by sunnedaze
Sorry... I entered /wlc050403 in the "Exclusions" box. I don't see the "Exclusion Prefix" box (looking in the basic walk settings for that profile)?

Posted: Fri Aug 01, 2003 4:43 pm
by mark
It's under "all walk settings" as mentioned previously.

Posted: Mon Aug 04, 2003 9:33 am
by sunnedaze
Yes, that worked. The site did not come up in the search. What next?

Posted: Mon Aug 04, 2003 11:02 am
by mark
Wasn't the goal for that page not to come up?

If the exclusion works in "Exclusion Prefix", there must be something odd about your robots.txt file, since that's where the robots.txt rules are placed during the walk. Without being able to fetch it from here, it's hard to say what.

In your installation directory there should be a program called "geturl.exe". From a command prompt run
geturl http://cleohsenet01.napa.ad.etn.com/robots.txt
and paste the full results here.
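
Once the robots.txt text is in hand, one way to sanity-check it is to run it through a standard parser. The sketch below uses Python's urllib.robotparser as a stand-in; it is not the walker's own parser, so it only shows how a conforming parser would read the rules. The sample robots.txt content, the page names, and the "*" user agent are assumptions, since the real file has not been posted in this thread.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_text, url, agent="*"):
    """Return True if robots_text permits the given agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; no network access is needed.
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical robots.txt content; paste the real geturl output here instead.
sample = """User-agent: *
Disallow: /wlc050403
"""

base = "http://cleohsenet01.napa.ad.etn.com"
print(is_allowed(sample, base + "/wlc050403/index.html"))
print(is_allowed(sample, base + "/other.html"))
```

If a standard parser blocks the page but the walk still indexes it, the problem is more likely in how the file is served (wrong location, an error page, odd line endings) than in the rules themselves.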

Posted: Tue Aug 05, 2003 12:39 pm
by sunnedaze
I don't seem to have access to a prompt, and when I run this from the Windows 2000 'Run' command (off the Start menu), the display flashes by too quickly to be read:
F:/Thunderstone Software/Webinator/geturl.exe http://cleohsenet01.napa.ad.etn.com
I have tried typing it several ways. This way doesn't give me the 'can't find components' message, but there is no output either.

Posted: Tue Aug 05, 2003 1:54 pm
by mark
Find and click the MS-DOS icon to get a prompt.