Question re limiting what links are walked...

legedza.henry
Posts: 142
Joined: Wed Jul 24, 2002 11:52 pm

Post by legedza.henry »

I want to walk a series of different domains, most of which link to documents in our document management system (DMS), which has an address of the form: www.site.edu.au/dms/docman/.....

I have entered the individual domains under BASE URL and have limited Webinator to STAY UNDER each of those domains.

What do I need to do so that Webinator also indexes any links that go to the DMS site? All I want to end up with is the indexed content for each domain, plus any linked DMS documents.

Thanks
Henry
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Add www.site.edu.au to "Extra Domains" or add
>>=http://www\.site\.edu\.au/dms/docman/
to "Extra URLs REX".
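
To illustrate the intent of that setting, here is a minimal Python sketch of the decision it implies: a link is followed if it sits under one of the base URLs, or if it matches the extra pattern. This is not Webinator's code. The base URLs and the function name `should_walk` are hypothetical, and the Python regex is only an approximation of the REX expression above, assuming it is meant to match URLs that begin with the DMS prefix.

```python
import re

# Hypothetical base URLs standing in for the individual domains being walked.
BASE_URLS = [
    "http://www.example-a.edu.au/",
    "http://www.example-b.edu.au/",
]

# Assumption: a Python approximation of the REX expression in the post,
# treating it as "URL starts with the DMS prefix".
EXTRA_URL_PATTERN = re.compile(r"^http://www\.site\.edu\.au/dms/docman/")


def should_walk(url: str) -> bool:
    """Return True if url is under a base URL or matches the extra pattern."""
    if any(url.startswith(base) for base in BASE_URLS):
        return True  # "stay under" rule: the link is below a base URL
    return EXTRA_URL_PATTERN.search(url) is not None  # extra-URL exception


if __name__ == "__main__":
    for candidate in (
        "http://www.example-a.edu.au/policies/index.html",
        "http://www.site.edu.au/dms/docman/report.pdf",
        "http://www.site.edu.au/news/",
    ):
        print(candidate, should_walk(candidate))
```

In this sketch the DMS documents pass the filter, while other pages on the DMS host do not.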
legedza.henry
Posts: 142
Joined: Wed Jul 24, 2002 11:52 pm

Post by legedza.henry »

I want to index this page http://www.decs.sa.gov.au/policy/pages/ ... icy_index/

When I set it to STAY UNDER, with your Extra URLs REX set to our document management system (www.decs.sa.gov.au/docs), it only indexes about 10 pages.

When I turn STAY UNDER off, it proceeds to index everything.

All I want is to index the links in the A-Z section. I don't want it to wander off into other areas such as Home, About DECS, STAFF Info, etc. and start walking them.
jason112
Site Admin
Posts: 347
Joined: Tue Oct 26, 2004 5:35 pm

Post by jason112 »

If you do the walk with a high verbosity (4) and then look at your Base URL in List/Edit URLs and click the "Children" link, it should list all the child URLs with the reasons that they weren't walked.

Glancing at a few of the URLs, it appears that they are under many different prefixes/sites:

http://www.decs.sa.gov.au/docs/documents/...
http://www.drugstrategy.sa.edu.au/aboutdrugstrategy/...
http://www.decs.sa.gov.au/docs/files/...

If these documents are the only things you're indexing for this profile, you could set "Offsite Pages" to "Y" and "Max Depth" to 1. That would have it "wander" only 1 page into the other links you mentioned ("About DECS", "Staff", etc).
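
As a rough illustration of what a depth limit of 1 does, here is a small Python sketch of a breadth-first walk over an in-memory link graph. It is not how Webinator is implemented. The page names, the `LINKS` graph, and the `walk` function are hypothetical, and it assumes depth is counted as the number of link hops from the starting page.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "index": ["doc1", "about"],
    "about": ["staff"],
    "doc1": [],
    "staff": [],
}


def walk(start, links, max_depth):
    """Breadth-first walk following links at most max_depth hops from start."""
    seen = {start}
    queue = deque([(start, 0)])
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth >= max_depth:
            continue  # page is fetched, but its links are not followed
        for child in links.get(url, []):
            if child not in seen:
                seen.add(child)
                queue.append((child, depth + 1))
    return order


if __name__ == "__main__":
    print(walk("index", LINKS, 1))
```

With a depth of 1, the pages linked directly from the index page are fetched, but nothing they link to is followed, so "staff" is never reached in this example.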