Limit to Number of Base URL's?

dhunter65
Posts: 15
Joined: Sun Apr 13, 2003 5:33 pm

Post by dhunter65 »

Is there a limit to the number of base URLs (or extra domains) that you can add to a profile? I have 5 base URLs and 5 extra domains and everything seems fine. I added a sixth URL and it's not being indexed. I get no results when I do a test search or a live search for the sixth URL.

Post by dhunter65 »

After further examination, that's not the problem at all. I've removed all but the last URL from the base URLs and then spidered the site, but I still get no results. I know it doesn't have anything to do with the robots.txt file, because it's identical to the robots.txt file of the other sites that are being indexed properly. Why would I get no results for this site?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Look at the walk status page to see what it says about that site. Go to list/edit urls to see what urls are found/walked. Turn verbosity up to 4 and redo the walk to aid in diagnostics.

And there's no particular limit on the number of base urls or domains.

Post by dhunter65 »

Nothing seems to be happening. Here are the results (URLs changed for privacy):

Webinator Walk Report for TM

Creating database d:\MORPH3/texis/TM/walk...Done.
Walk started at 2003-07-31 13:06:43 (by user)
Verbosity set to 4
JavaScript walking enabled
HTTPS walking disabled
Start fetching at http://www.pppp.org
http://www.pppp.org
Ignore urls containing any of the following:
/cgi-bin/
~
/secure/

Background: D:\inetpub\wwwroot\scripts\texis.exe -r profile="TM" top="http%3A//www.pppp.org" "d:\MORPH3/webinator/dowalk\start.txt"
started 1 (2280) on http://www.pppp.org
0 errors
0 duplicate pages
No pages fetched. Search not updated.

Post by dhunter65 »

Walk Settings:

Base URL = http://www.pppp.org
Enterprise - checked + http://www.pppp.org
Extra Domain = pppp.org

List/Edit URLs - When submitting "*" as the criteria, I get "0 matching pages" as a result.

Post by mark »

All I can guess is there's something odd about your base page. But I can't tell without knowing what it actually is. You can make your message private so that only Thunderstone can see it.

Your enterprise setting is incorrect. It should be "pppp.org" and you don't need to specify it again in extra domains.

You can try the "geturl" function of dowalk to see what it thinks about it. See http://www.thunderstone.com/texis/site/ ... the+Walker
and http://www.thunderstone.com/texis/site/ ... ing+dowalk

Post by dhunter65 »

There's quite a bit of JavaScript on this page, but I've run this profile with all of the JavaScript options turned on and I've run it with all of the JavaScript options turned off...same results.

The page is http://www.ppasny.org. Please take a look and see if there might be a reason that webinator can't index this page.

Thanks

Post by mark »

I'm able to walk that site fine using the defaults, plus the settings you suggested, plus adding .asp to the extensions list. Try adding .asp. If you still have problems, try the geturl suggested above or the simpler geturl command:
INSTALLDIR\geturl http://www.ppasny.org

Maybe the machine you're doing the walk from doesn't know the server by that name? Can a browser running on the same machine as webinator access the site? Does it need a proxy to do so?
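
The name-resolution and proxy questions above can also be checked from a script on the walk machine. This is only a rough sketch in Python, not anything that ships with Webinator; the function names `host_of`, `resolves`, and `proxy_settings` are made up for illustration, and the proxy settings Python sees may not match what a browser on the same machine is using.

```python
import socket
import urllib.request
from urllib.parse import urlsplit


def host_of(url):
    """Return the hostname part of a URL, e.g. 'www.ppasny.org'."""
    return urlsplit(url).hostname


def resolves(host):
    """Return True if this machine can turn the hostname into an address."""
    try:
        socket.getaddrinfo(host, 80)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def proxy_settings():
    """Return the proxy settings visible to Python (environment or OS)."""
    return urllib.request.getproxies()
```

If `resolves(host_of("http://www.ppasny.org"))` comes back False on the walk machine, that points at name resolution rather than at the walker. A non-empty result from `proxy_settings()` suggests a browser on that machine may be reaching the site through a proxy.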

Post by dhunter65 »

Yes...I can browse to the site from the Webinator server.

I think I figured out the problem. I have several DOS prompt windows open running TSQL so I can query the various DBs. I don't think Webinator was able to overwrite a DB when I re-ran a profile. So I closed the DOS windows that were connected to DB1, DB2, and WALK. Now it seems to be walking properly.

I think I understand the purpose of DB1 & DB2, but what is the WALK db there for?

Thanks

Post by mark »

Don't know since the standard Webinator scripts don't create any database called "walk".