exclusion Pattern
Posted: Thu Apr 06, 2006 4:12 am
by rajesh11
I want to ignore the following pattern:
http://www.verkkouutiset.fi/arkisto/talous/[any Number].html
I provided
http://www.verkkouutiset.fi/arkisto/
~
?
but it's still crawling it.
exclusion Pattern
Posted: Thu Apr 06, 2006 10:41 am
by mark
A little more info please. What's your base url and other non-default settings?
Is this a new or refresh walk?
What version of the scripts are you using?
exclusion Pattern
Posted: Thu Apr 06, 2006 11:03 am
by rajesh11
The base URL is
http://www.verkkouutiset.fi/
It is a refresh walk.
version : 2.406
Extension : .html .htm .txt .pdf .doc .swf .php .jsp .cfm .jhtml .shtml .dhtml .asp .pl .cgi .phtml
Parallelism : 3 - 3
Verbosity : 5
Strip Queries : N
exclusion Pattern
Posted: Thu Apr 06, 2006 11:38 am
by mark
A refresh walk keeps the URLs it already has, regardless of rule changes. Use list/edit urls to find and delete those URLs from the database.
Also, for 1 base URL, servers should be set to 1. 3 won't hurt, but it will use slightly more CPU.
Not sure what "2.406" is. The version number is in the top right of most pages of the dowalk admin interface. Or, if you look in the script, it's in the <$version= line.
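
For the cleanup step, it may help to check which already-stored URLs actually match the pattern from the first post before deleting them. The following is only a rough Python sketch, not the walker's own exclusion syntax or tooling: the regular expression is an assumed reading of "[any Number].html", and the function name and sample URLs are hypothetical.

```python
import re

# Assumed reading of the pattern from the first post:
# http://www.verkkouutiset.fi/arkisto/talous/<digits>.html
EXCLUDE_RE = re.compile(
    r"^http://www\.verkkouutiset\.fi/arkisto/talous/\d+\.html$"
)


def urls_to_delete(stored_urls):
    """Return the stored URLs that match the exclusion pattern."""
    return [url for url in stored_urls if EXCLUDE_RE.match(url)]


# Hypothetical sample of URLs already in the database.
sample = [
    "http://www.verkkouutiset.fi/arkisto/talous/12345.html",
    "http://www.verkkouutiset.fi/arkisto/talous/index.html",
    "http://www.verkkouutiset.fi/index.html",
]

for url in urls_to_delete(sample):
    print(url)
```

Only URLs whose file name is purely numeric are selected here; if other pages under arkisto/ should go as well, the expression would need to be loosened accordingly.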