Returning a default value

Post Reply
tony.malandain
Posts: 57
Joined: Sat Mar 27, 2004 6:39 pm

Returning a default value

Post by tony.malandain »

Hi,
I'm trying to build a SQL query that would use LIKEP to return the record best matching some keywords or fallback to a default record (if it exists) if the keywords cannot be found.
For instance, let's say I have the following table
DocID | PageID | PlainText
id1 | 1 | 'this is page 1'
id1 | 2 | 'this is page 2'
id1 | 3 | 'this is page 3'
id2 | 1 | 'no matching text here'
id2 | 2 | 'no matching text here'

Right now I run the following query
select max($rank), DocId, Abstract from myTable where DocId in ('id1', 'id2', 'id3') and PlainText likep 'page 2' group by DocId which will return only the
record (DocId=id1, pageId=2) because the likep doesn't match any records for id2.

Now how can I modify the query to have it return (id2, pageId=1) too? In other words, if it doesn't find the keywords in the PlainText fields any record with DocId=id2, then it should return the record with pageId=1.
Note that the result should not contain a (DocId=id3, PageId=1) entry because there is no entry with DocId=id3 in the table

Thanks,
/Tony
User avatar
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Returning a default value

Post by mark »

<sql "give em what they asked for">
</sql>
<if $loop eq 0>
<sql "give em something else">
</sql>
</if>
tony.malandain
Posts: 57
Joined: Sat Mar 27, 2004 6:39 pm

Returning a default value

Post by tony.malandain »

Thanks for the reply Mark.

I realize I could run 2 queries to combine the results, but I'm looking for a more efficient way of doing it. I'm afraid running 2 texis queries for every search query would impact the performance of the server.
Is there a way for the likep to return a rank of 0 if the keywords are not found but the record exist?
I'm assuming a value of 0 would make the record show up in the search result at least. Is that correct?
User avatar
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH
Contact:

Returning a default value

Post by John »

No, the index is used to only return records with a rank exceeding a threshold. Having a 0 threshold would mean every record, which would be much more inefficient than finding the docids that did not have a match and doing a second query to return those, or in your case it looks like you can just assume PageID = 1 so it wouldn't need the second query?
John Turnbull
Thunderstone Software
tony.malandain
Posts: 57
Joined: Sat Mar 27, 2004 6:39 pm

Returning a default value

Post by tony.malandain »

Unfortunately, I cannot as the caller may provide DocIds for which I don't have any record.
Assuming PageID=1 would mean that I have a document with the specified DocId which is not always the case.
User avatar
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH
Contact:

Returning a default value

Post by John »

A second query of

... where DocID in $nonmatchingdocids and PageID = 1

should be pretty quick with an index on DocID, PageID.
John Turnbull
Thunderstone Software
tony.malandain
Posts: 57
Joined: Sat Mar 27, 2004 6:39 pm

Returning a default value

Post by tony.malandain »

An alternative I have in mind is to do add an OR operator in the where clause:
where (DocId in ('id1', 'id2', 'id3') and PlainText likep 'page 2') or (DocID in ('id1', 'id2', 'id3') and PageID=1)

The problem here is that the 2nd test returns a perfect match. Is there a way to can apply weights on the 2 tests so that test1 will always win over test2?

-----------
Let me be more specific: By perfect match I mean that even if the keywords exist in PlainText, the record with PageID=1 will always be returned as it scores 1000 for $rank where the FT search never goes higher than 800.
User avatar
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH
Contact:

Returning a default value

Post by John »

Not really, as you say that clause is a perfect match.
John Turnbull
Thunderstone Software
Post Reply