Multiple profiles

roberto_george
Posts: 13
Joined: Fri Jun 08, 2001 1:54 pm

Post by roberto_george »

1. I have a number of links that I want to index, and every one of these links has its own crawling options. Is there any way to set these options for the crawler without using multiple profiles and without using the administration interface? I have all of these options in an external MySQL database (along with the URLs that I want crawled), and I want to start the crawler manually with the options I read from the MySQL db.
Yes, I know your administration script is very well done, but I want to let my users change only the options associated with the URL(s) that they own, and keep your admin interface away from them.

2. Is it true that running a walk over a site will increase the number of license hits used? I'm not sure whether this happened in the previous version using gw.

3. Is the vhttp server included only with a full Texis license, or can it be purchased separately?

4. How do I set (in the administration interface) the default options for the default profile?

5. If I am using multiple profiles, how can I run a query against some of them (or all of them)?

Sorry for so many questions :)
Thank you.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

1. To index a bunch of sites, each with different options, into the same database, you'll pretty much have to use the Webinator 2.5 program gw that is included with Webinator 4.
The dowalk script could be modified to do what you want, but it would be a non-trivial process.

2. Each page view on the admin interface will count as a license hit. That's not generally a problem unless you have a long-running walk and leave the walk status screen up the whole time, auto-refreshing every 10 seconds.

3. vhttpd is only available with full Texis on Unix (non-Windows) platforms.

4. The dowalk script contains all of the defaults. More commonly, you would just make a profile to use as a template with your desired defaults; then, when creating a new profile, make it a copy of your template profile instead of "- defaults -".

5. In the search script you can switch databases between queries:
<db=firstdb>
<dothesearch>
<db=seconddb>
<dothesearch>
You would have to write the function "dothesearch" by extracting the search <sql> loop from the existing "search" function.
Or you could do a meta search on it. See http://www.thunderstone.com/texis/site/demos/meta/ for a generic example of metasearching.
roberto_george
Posts: 13
Joined: Fri Jun 08, 2001 1:54 pm

Post by roberto_george »

1. Will the Webinator 2.5 program gw be supported in the next release? If not, can you be more specific about doing the same thing using the dowalk script? I would like to know how the dowalk script transfers the options to the spider in Webinator version 4.
roberto_george
Posts: 13
Joined: Fri Jun 08, 2001 1:54 pm

Post by roberto_george »

2. Yes, I know that every page view in the admin interface is counted as a license hit. After I realized that, I ran a test: I closed all the admin interfaces that were open, ran one "texis -license" command, then ran another after some time (30-60 seconds), and saw that the number of license hits had incremented. I repeated this test more than once, and I'm sure that nobody is using this server for searching or anything else that would use hits (it's a development server). The only activity on this server was the spidering process.
roberto_george
Posts: 13
Joined: Fri Jun 08, 2001 1:54 pm

Post by roberto_george »

5. OK, but in this case will the result set contain the records from both databases, ordered by a ranking that goes across both databases? As far as I know, the ranking is based on the current result set generated by the current SQL query.

What I want is to show the results as coming from a single query against a single db, not as successive queries against multiple databases as in your "meta search" example.

Thank you
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

gw will be around for a while now, and we'll be improving the command-line drivability of dowalk. dowalk does not transfer anything to gw; dowalk IS the version 4 walker (spider).

dowalk launches a new instance to handle each distinct site while walking. So if you're using enterprise, extra domains, or base URLs from multiple sites, it will cause a license hit for each one.

In either case, self query or meta search, you would have to collect the answers from all of the databases, then <sort> them based on rank. For your application it's probably better to keep them all in one database and walk with gw instead of dowalk.
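
To make the "collect, then sort by rank" step concrete, here is a minimal sketch in Python rather than Vortex. It is not Webinator code: the Hit record, the merge_by_rank helper, and the sample URLs and rank values are all made up for illustration. It also assumes that rank values returned by separate queries are comparable across databases, which this thread does not confirm.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    """One search result (hypothetical record layout, for illustration only)."""
    rank: float      # relevance score reported by the query that found the hit
    url: str
    source_db: str   # which database the hit came from


def merge_by_rank(result_sets: Iterable[List[Hit]], limit: int = 10) -> List[Hit]:
    """Pool the hits from every database, then order them by rank, best first.

    Assumes the rank values are on the same scale in every result set.
    """
    merged = [hit for hits in result_sets for hit in hits]
    merged.sort(key=lambda hit: hit.rank, reverse=True)
    return merged[:limit]


# Made-up result sets standing in for one query per database.
first = [
    Hit(870.0, "http://a.example/index.html", "firstdb"),
    Hit(410.0, "http://a.example/faq.html", "firstdb"),
]
second = [
    Hit(920.0, "http://b.example/home.html", "seconddb"),
    Hit(300.0, "http://b.example/old.html", "seconddb"),
]

top = merge_by_rank([first, second], limit=3)
for hit in top:
    print(hit.rank, hit.source_db, hit.url)
```

If the ranks turn out not to be comparable between databases, the merged order will be misleading, which is one more reason the single-database approach suggested above is simpler.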