Limit to Number of Base URL's?
Posted: Thu Jul 31, 2003 11:59 am
by dhunter65
Is there a limit to the number of base URLs (or extra domains) that you can add to a profile? I have 5 base URLs and 5 extra domains and everything seems fine. I added a sixth URL and it's not being indexed. I get no results when I do a test search or a live search for the sixth URL.
Posted: Thu Jul 31, 2003 12:15 pm
by dhunter65
After further examination, that's not the problem at all. I've removed all but the last URL from the base URLs and then spidered the site, but I get no results. I know it doesn't have anything to do with the robots.txt file because it's identical to the robots.txt file of the other sites that are being indexed properly. Why would I get no results for this site?
Posted: Thu Jul 31, 2003 12:30 pm
by mark
Look at the walk status page to see what it says about that site. Go to list/edit urls to see what urls are found/walked. Turn verbosity up to 4 and redo the walk to aid in diagnostics.
And there's no particular limit on the number of base urls or domains.
Posted: Thu Jul 31, 2003 1:11 pm
by dhunter65
Nothing seems to be happening. Here are the results (URLs changed for privacy):
Webinator Walk Report for TM
Creating database d:\MORPH3/texis/TM/walk...Done.
Walk started at 2003-07-31 13:06:43 (by user)
Verbosity set to 4
JavaScript walking enabled
HTTPS walking disabled
Start fetching at
http://www.pppp.org
http://www.pppp.org
Ignore urls containing any of the following:
/cgi-bin/
~
/secure/
Background: D:\inetpub\wwwroot\scripts\texis.exe -r profile="TM" top="http%3A//www.pppp.org" "d:\MORPH3/webinator/dowalk\start.txt"
started 1 (2280) on
http://www.pppp.org
0 errors
0 duplicate pages
No pages fetched. Search not updated.
Posted: Thu Jul 31, 2003 1:16 pm
by dhunter65
Walk Settings:
Base URL = http://www.pppp.org
Enterprise - checked + http://www.pppp.org
Extra Domain = pppp.org
List/Edit URLs - When submitting "*" as the criteria, I get "0 matching pages" as a result.
Posted: Thu Jul 31, 2003 1:55 pm
by mark
All I can guess is there's something odd about your base page. But I can't tell without knowing what it actually is. You can make your message private so that only Thunderstone can see it.
Your enterprise setting is incorrect. It should be "pppp.org" and you don't need to specify it again in extra domains.
You can try the "geturl" function of dowalk to see what it thinks about it. See
http://www.thunderstone.com/texis/site/ ... the+Walker
and
http://www.thunderstone.com/texis/site/ ... ing+dowalk
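To illustrate why a full URL does not work where a bare domain is expected, here is a minimal Python sketch of a generic hostname suffix check. It is only an illustration: the function name host_in_domain is hypothetical, and the thread does not describe how Webinator itself matches the enterprise or extra domain values.

```python
from urllib.parse import urlparse


def host_in_domain(url: str, domain: str) -> bool:
    """Return True if the URL's hostname equals `domain` or is a subdomain of it.

    Hypothetical helper for illustration; not Webinator's own matching logic.
    """
    host = (urlparse(url).hostname or "").lower()
    domain = domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


# A bare domain matches the host of the base URL.
print(host_in_domain("http://www.pppp.org", "pppp.org"))             # True
# A full URL used as the "domain" never matches a hostname.
print(host_in_domain("http://www.pppp.org", "http://www.pppp.org"))  # False
```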
Posted: Thu Jul 31, 2003 2:31 pm
by dhunter65
There's quite a bit of JavaScript on this page, but I've run this profile with all of the JavaScript options turned on and I've run it with all of the JavaScript options turned off...same results.
The page is http://www.ppasny.org. Please take a look and see if there might be a reason that Webinator can't index this page.
Thanks
Posted: Thu Jul 31, 2003 3:19 pm
by mark
I'm able to walk that site fine using defaults plus the settings you suggested plus adding .asp to the extensions list. Try adding .asp. If you still have problems, try the geturl suggested above or the simpler geturl command:
INSTALLDIR\geturl http://www.ppasny.org
Maybe the machine you're doing the walk from doesn't know the server by that name? Can a browser running on the same machine as webinator access the site? Does it need a proxy to do so?
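The name-resolution question above can also be checked without a browser. The following is a minimal Python sketch, run on the same machine as Webinator, that only tests whether the hostname resolves; it does not fetch the page or handle proxies. The function name check_host and its resolver parameter are assumptions made for this example and are not part of Webinator.

```python
import socket
from urllib.parse import urlparse


def check_host(url, resolver=socket.gethostbyname):
    """Return (hostname, ip_address, error) for the host named in `url`.

    `resolver` defaults to the system lookup; it is a parameter only so the
    function can be exercised without network access.
    """
    host = urlparse(url).hostname
    if not host:
        return (None, None, "no hostname in URL")
    try:
        return (host, resolver(host), None)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return (host, None, str(exc))
```

Calling check_host("http://www.ppasny.org") on the Webinator server returns an IP address in the second position if the name resolves there, or an error message in the third position if it does not.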
Posted: Thu Jul 31, 2003 3:46 pm
by dhunter65
Yes...I can browse to the site from the Webinator server.
I think I figured out the problem. I had several DOS prompt windows open running TSQL so I could query the various databases. I don't think Webinator was able to overwrite a database when I re-ran a profile while those windows were connected to it. So I closed the DOS windows that were connected to DB1, DB2, & WALK. Now it seems to be walking properly.
I think I understand the purpose of DB1 & DB2, but what is the WALK db there for?
Thanks
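For anyone hitting the same symptom, one rough way to spot files held open by another process is to try renaming each file to its own name. This is a minimal Python sketch under assumptions: on Windows such a rename typically fails while another process has the file open without sharing it for deletion, whereas on most other systems it succeeds regardless, so the check is only meaningful on Windows. The function name files_in_use is hypothetical, and nothing here is a Webinator or Texis feature.

```python
import os


def files_in_use(directory):
    """Return names of files in `directory` that cannot be renamed in place.

    Renaming a file to its own name changes nothing on success; an OSError
    is treated as a hint that another process is holding the file open.
    """
    busy = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        try:
            os.rename(path, path)
        except OSError:
            busy.append(name)
    return busy
```

Pointing it at a profile's database directory (for example, the directory named in the walk report above) before re-running a walk would list any files that look locked; an empty list means no file refused the rename.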
Posted: Thu Jul 31, 2003 3:55 pm
by mark
Don't know since the standard Webinator scripts don't create any database called "walk".