problem walking with url url.

Posted: Mon Mar 24, 2008 11:29 am
by jgdoke
The only change I made was to include a URL URL that lists all the pages on my sites. After 16 hours the crawl had indexed fewer than 700 pages. A normal walk finishes with 11,000 pages in 18 hours. Could the size of my URL file be the issue? There are 15,000 lines. I stopped the walk and all the walk data is gone. It shows the previous walk.

Posted: Mon Mar 24, 2008 12:07 pm
by mark
Not sure why it would be slow, but if you want to list all the pages instead of having the crawler discover them, you should use Page URL instead of URL URL.

Yes, hitting "Stop" abandons the currently running walk. You can see the status for the abandoned walk from the Maintenance->Manage Logs->Profile Walk Status area. Use "Pause and live" to stop a walk and keep the data.

Posted: Mon Mar 24, 2008 12:33 pm
by jgdoke
Using Page URL means that hyperlinks found on the pages will not be followed. I want them to be followed.

Is there something else I am missing about using one vs. the other?

Posted: Mon Mar 24, 2008 1:45 pm
by mark
I'm not sure why you'd need to follow links if you "list all the pages on your sites"?

Posted: Mon Mar 24, 2008 3:05 pm
by jgdoke
All current pages. This list took two days to come up with. The appliance crawls most of the site, but a lot of our links are in a JavaScript menu that the appliance does not seem to parse. I don't know why, as I have the JavaScript stuff turned on.

So this is just a help to find more pages. I would like the appliance to find these on its own, but it doesn't.

Posted: Mon Mar 24, 2008 4:44 pm
by mark
The large list of base URLs may or may not be the issue. Please open a ticket and send your All Walk Settings page, Walk Status page, and tech support info page.
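
Before sending the list along, it may be worth a quick sanity check of the URL file itself, since duplicates or malformed lines in a 15,000-line list are easy to miss. The following is only a rough Python sketch, assuming the list is plain text with one URL per line; the function name `summarize_url_list` and the file name `urls.txt` are made up for illustration and have nothing to do with the appliance's own tools.

```python
from collections import Counter
from urllib.parse import urlsplit


def _is_http_url(url):
    """Return True if url parses as an http(s) URL with a host part."""
    try:
        parts = urlsplit(url)
    except ValueError:  # e.g. a broken IPv6 literal
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def summarize_url_list(lines):
    """Summarize a one-URL-per-line list (hypothetical helper).

    Blank lines are ignored; surrounding whitespace is stripped.
    """
    urls = [line.strip() for line in lines if line.strip()]
    counts = Counter(urls)
    # Entries that appear more than once in the list.
    duplicates = sorted(u for u, n in counts.items() if n > 1)
    # Entries that are not usable http(s) URLs (missing scheme, no host, ...).
    malformed = sorted(u for u in counts if not _is_http_url(u))
    # How many distinct, well-formed URLs point at each host.
    hosts = Counter(urlsplit(u).netloc for u in counts if _is_http_url(u))
    return {
        "total": len(urls),
        "unique": len(counts),
        "duplicates": duplicates,
        "malformed": malformed,
        "hosts": dict(hosts),
    }


# Example use with a local copy of the list (file name is an assumption):
#     with open("urls.txt") as f:
#         print(summarize_url_list(f))
```

This only inspects the list; it says nothing about why the walk was slow, but it can rule out an obviously bad input file before the ticket is opened.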