I understand that a refresh only refreshes what is already in the database. But if I add some exclusions and an exclude REX after I pause the walk, would it stop refreshing the matching pages that are already in the database, and would it stop bringing in URLs similar to the ones already there? I missed one of the sorttypes to exclude when crawling a website, and that mistake has now cost me some 18,000 pages. The refresh walk keeps bringing in more pages even after I added the exclusions.
Also, how can I add URL patterns that are query patterns rather than directory structures?
Instead of http://www.somesite.com/somedir/*
I want to add http://www.somesite.com?somequery?somevar=*
If I have multiple categories, and assuming there is some way to accomplish the query pattern, what happens to URLs that match none of the categories?
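To make the question concrete, here is a rough Python sketch of the matching I have in mind. It is not the crawler's own pattern syntax or behavior; the category names and the `glob_to_regex` and `matching_categories` helpers are made up purely for illustration, and I am assuming `*` means "anything" while `?` is a literal character.

```python
import re

# Hypothetical category patterns, only to illustrate the question.
CATEGORIES = {
    "by_directory": "http://www.somesite.com/somedir/*",
    "by_query": "http://www.somesite.com?somequery?somevar=*",
}

def glob_to_regex(pattern):
    # Escape everything (so "?" and "." are literal), then turn "*" into ".*".
    return re.escape(pattern).replace(r"\*", ".*")

def matching_categories(url):
    # Return the names of all categories whose pattern matches the whole URL.
    return [
        name
        for name, pattern in CATEGORIES.items()
        if re.fullmatch(glob_to_regex(pattern), url)
    ]

print(matching_categories("http://www.somesite.com/somedir/page.html"))
print(matching_categories("http://www.somesite.com?somequery?somevar=42"))
print(matching_categories("http://www.somesite.com/other/"))
```

The third URL matches no category and comes back with an empty list; that is the case I am asking about.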