This page contains numerous links that start with the following path, and I would like to include them in the index. What walk setting should I use? I've tried Extra URLs REX, but I cannot get it to work.
I didn't want to include it as another base URL because I don't want it to crawl the actual web pages starting at the root of "Public". My base page contains numerous links to actual files, which is what I want to include.
I added your expression, but it doesn't seem to be working. The links I want indexed show up as children, but they are not being indexed.
Turn verbosity up to 4. Do a new walk with rewalk type set to new. Then the child links under list/edit urls should indicate why they were rejected. Give an actual example url that's not getting indexed.
Sorry, I had my blinders on. Setting Extra URLs isn't the way to do what you want. You need to put http://leagis:8080/Public/ in the base URL.
Then in "exclude by field" enter a query of
/http://leagis:8080/Public/=>>=
for field "URL" and exclude "Pages and Links". That will tell it that site leagis:8080 is acceptable but that the particular page http://leagis:8080/Public/ isn't and shouldn't be followed.
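To make the intended effect of that base URL plus exclusion concrete, here is a rough Python sketch of the decision being described. It is not how Webinator implements the check, and it assumes the exclusion matches only the exact page URL, as described above. The names `ALLOWED_HOST`, `EXCLUDED_PAGE`, and `is_walkable`, and the file name `report.pdf`, are made up for illustration.

```python
from urllib.parse import urlsplit

# Host taken from the base URL in this thread.
ALLOWED_HOST = "leagis:8080"
# The one page that should be neither indexed nor followed (assumed exact match).
EXCLUDED_PAGE = "http://leagis:8080/Public/"


def is_walkable(url: str) -> bool:
    """Return True if the URL is on the allowed site and is not the excluded page."""
    if urlsplit(url).netloc != ALLOWED_HOST:
        return False  # a different site, so the base URL does not cover it
    return url != EXCLUDED_PAGE  # the Public listing page itself is rejected


# Hypothetical file link found on the original base page.
print(is_walkable("http://leagis:8080/Public/report.pdf"))
# The directory page itself.
print(is_walkable("http://leagis:8080/Public/"))
```

Under these assumptions, files linked under /Public/ are accepted while the /Public/ page itself is rejected, so the walk never starts crawling from the root of "Public".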
After applying the REX expression in #2 above, I later noticed that I was getting an "out of memory" error. I found a reference to that in this knowledge base, which said to restart the walk. I did that, and now it is working as expected.