retrieving data from two databases and presenting results as if from one

jsoltys
Posts: 14
Joined: Wed Feb 06, 2002 6:54 pm

Post by jsoltys »

I've got two large databases that I need to retrieve data from based on user queries. Once I get the data back, I need to order it by relevance or date and present it to the user as though it came from a single source. To make matters worse, the user will be able to specify how many results are shown per page and page through them X at a time.

My first thought was to retrieve all the data from each and then sort the results together, but this raises an interesting question: are rankings from a likep query valid only within a single result set? In other words, if I get a match with a ranking of 780 from one database, does a ranking of 790 from the other database imply that the second match is more relevant?

Is anyone out there doing this sort of thing? I'm wondering whether it would be more efficient to distill the content and put it into a third database with pointers to the source databases. Then I could just query the aggregate db and only have to hit a source db when someone wanted to view a match. Of course, this would lose some accuracy, which is much prized for this application.

Any suggestions?
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

You could just retrieve X results from each database, sort them together in Vortex, and present the top X as the next page. You will of course need to remember how many results you have already presented from each database so that you skip the correct number on the next request.
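
The paging idea above can be sketched roughly as follows. This is Python rather than Vortex, and every name in it (next_page, make_source, the fetch callables, the (rank, doc) row shape) is hypothetical and only for illustration. It assumes each database can return its matches already sorted by rank, highest first, starting at a given skip count.

```python
from typing import Callable, List, Tuple

Row = Tuple[int, str]                     # (rank, document id); higher rank first
Fetch = Callable[[int, int], List[Row]]   # fetch(skip, limit) -> rows sorted by rank


def make_source(rows: List[Row]) -> Fetch:
    """Hypothetical stand-in for a database query that supports skip/limit."""
    ordered = sorted(rows, key=lambda r: r[0], reverse=True)
    return lambda skip, limit: ordered[skip:skip + limit]


def next_page(fetch_a: Fetch, fetch_b: Fetch,
              skip_a: int, skip_b: int,
              page_size: int) -> Tuple[List[Row], int, int]:
    """Return one merged page plus the updated skip counts for each source."""
    # Pull a full page from each source; the merged page can never need
    # more than page_size rows from either one.
    rows_a = [(rank, doc, "a") for rank, doc in fetch_a(skip_a, page_size)]
    rows_b = [(rank, doc, "b") for rank, doc in fetch_b(skip_b, page_size)]

    # Sort the candidates together and keep only the top page_size.
    merged = sorted(rows_a + rows_b, key=lambda r: r[0], reverse=True)
    page = merged[:page_size]

    # Remember how many rows each source contributed, so the next
    # request skips exactly those and nothing more.
    used_a = sum(1 for r in page if r[2] == "a")
    used_b = len(page) - used_a
    return [(rank, doc) for rank, doc, _ in page], skip_a + used_a, skip_b + used_b
```

The two skip counts would have to travel with the user between requests (for example in the paging links). Sorting by rank across sources only makes sense if the ranks are comparable, which is the subject of the next paragraph.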

Likep rankings are not necessarily comparable across databases, in that the same document might not receive the same rank in each database. That will occur if the words in the query occur with different frequencies in the two databases. You could set likeptblfreq to 0, which would counteract that effect. Depending on the nature of the two databases, it may also be appropriate to keep the default settings.
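
To illustrate why word frequency matters, here is a toy example using a generic inverse-document-frequency weight. This is not likep's actual ranking formula, and the function name and the document counts are made up; it only shows that the same query word can carry a different weight depending on how common it is in each database.

```python
import math


def idf_weight(total_docs: int, docs_with_term: int) -> float:
    """Generic textbook IDF weight; an assumption, not the likep formula."""
    return math.log(total_docs / docs_with_term)


# Hypothetical counts: the same query word is rare in one database
# and common in the other.
weight_in_db1 = idf_weight(total_docs=100_000, docs_with_term=50)
weight_in_db2 = idf_weight(total_docs=100_000, docs_with_term=20_000)

# The rarer word gets the larger weight, so an identical document
# would score higher in the first database than in the second.
print(weight_in_db1 > weight_in_db2)
```

Removing the table-frequency component from the score takes this source of disagreement out of the comparison.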

I'm not sure why you'd lose accuracy with a single database. It will generally be more efficient to have one database, although beyond a certain size it will make little difference.
John Turnbull
Thunderstone Software