SQL with Word Forms

dao
Posts: 31
Joined: Fri Apr 12, 2002 2:26 pm

Post by dao »

Hi,

I have this setup:

In the HTML table, I have very short documents consisting of words and phrases like the following:

Document 1 Body: adapt
Document 2 Body: adaptation
Document 3 Body: Adaptive controllers
...

I want to read in a word like "adaptive" and then use the word-forms capability of Metamorph to find all documents that contain *only* variants of "adaptive".

That means that, from the table above, I want to find only documents 1 and 2, since they contain *only* variants of my query. Document 3 contains a variant, but it also contains other words.

If I use the following select statement, I get back all three documents:

select Body from HTML where Body like 'adaptive';

Is there a way to take advantage of the word-forms processing yet limit the search to Bodies that are exact matches of the variants?

Thanks

dao@mit.edu
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Are there any predictable constraints on the Body field? For example, will it always contain only keywords, with no extra leading or trailing spaces or punctuation? Or is it random text?
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

One possible approach would be to do the query in both directions:

select Body from HTML where Body like 'adaptive' and 'adaptive' like Body;

which would be appropriate if the number of matches to the first part were relatively small.
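
To see why the second condition filters out document 3, here is a rough Python sketch of the two-direction idea. It is not how Texis or Metamorph works internally: the suffix list and the `stem`, `like`, and `only_variants` helpers are hypothetical stand-ins, and it assumes that a match requires every term of the query to have a variant in the text.

```python
import re

# Hypothetical suffix list: a crude stand-in for real word-forms processing.
SUFFIXES = ("ation", "ive", "ers", "er", "s")


def stem(word):
    """Lowercase a word and strip the first matching suffix, if any."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stems(text):
    """Return the set of stems for the alphabetic words in text."""
    return {stem(w) for w in re.findall(r"[A-Za-z]+", text)}


def like(text, query):
    """Assumed matching rule: every query term has a variant in text."""
    return stems(query) <= stems(text)


def only_variants(body, query):
    """Apply the match in both directions, as in the SQL above."""
    return like(body, query) and like(query, body)


docs = {1: "adapt", 2: "adaptation", 3: "Adaptive controllers"}
matches = [n for n, body in docs.items() if only_variants(body, "adaptive")]
print(matches)
```

In this sketch document 3 passes the first test but fails the second, because "controllers" used as a query term has no variant in the text "adaptive". Punctuation such as commas or parentheses is ignored here since only alphabetic words are extracted.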
John Turnbull
Thunderstone Software
dao
Posts: 31
Joined: Fri Apr 12, 2002 2:26 pm

Post by dao »

Answer to Mark:

The content is predictably a keyword or key phrase with no leading or trailing spaces. There may be punctuation: commas, quotation marks, and parentheses.
dao
Posts: 31
Joined: Fri Apr 12, 2002 2:26 pm

Post by dao »

Answer to John,

Very cool possible solution. The number of matches to the first part is usually around 100 to 200.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

My idea wouldn't work anyhow. John's idea should work pretty well.