I am trying to find WORD1 within 2 words of WORD2. My query looks like this:
$tsql "set withinmode='word';set indexwithin=1;select * from tbltest where author like 'Income w/6 Company'"
ID author
------------+------------+
1 Income produced by edible lipgloss brought the company ,000,000 last year.
If I change the proximity buffer from 4 to 2 (i.e., Income w/2 Company), I still get the same hit, even though Company is not within two words of Income. I am sure I am doing something wrong. Please help!
Which version of tsql are you using? It should work in tsql with set withinmode='word'. The following works for me:
create table tbltest(ID int,author varchar(20));
insert into tbltest values(1,'Income produced by edible lipgloss brought the company ,000,000 last year.');
create metamorph inverted index idxmauthor on tbltest(author);
set withinmode='word';
set indexwithin=1;
select * from tbltest where author like 'income company w/1';
select * from tbltest where author like 'income company w/2';
select * from tbltest where author like 'income company w/3';
select * from tbltest where author like 'income company w/4';
select * from tbltest where author like 'income company w/5';
select * from tbltest where author like 'income company w/6';
select * from tbltest where author like 'income company w/7';
drop table tbltest;
Thank you.
I am using version 05.01.1138031466(20060123).
I tried your tsql statements and they worked like a charm. They returned hits for w/4 or higher.
Then I added a few index expressions before the create index and ran the script again. This time I got hits for w/2 or higher. I narrowed it down to the expression set addexp='\alnum{1,99}'. If I remove \alnum{1,99} from the script, I get the right hits. With this expression in place, hits are returned for w/2, which is obviously wrong. Is this expected behavior?
drop table tbltest;
create table tbltest(ID int,author varchar(20));
insert into tbltest values(1,'Income produced by edible lipgloss brought the company ,000,000 last year.');
set keepnoise='on';
set delexp=0;
set addexp='\punct{1,5}';
set addexp='\alnum{1,99}';
set addexp='>>\alpha{1,50},=\alpha{1,50}';
create metamorph inverted index idxmauthor on tbltest(author);
set withinmode='word';
set indexwithin=1;
select * from tbltest where author like 'income company w/1';
select * from tbltest where author like 'income company w/2';
select * from tbltest where author like 'income company w/3';
select * from tbltest where author like 'income company w/4';
select * from tbltest where author like 'income company w/5';
select * from tbltest where author like 'income company w/6';
select * from tbltest where author like 'income company w/7';
John, the updated version that you sent us seems to have resolved the anomaly. Thank you!
I have one more question about the withinmode settings. Sometimes it appears that some of the words within the proximity buffer are being ignored.
For example,
'Income w/4 Company' finds
"Income produced by edible lipgloss brought the company ,000,000 last year."
There are 6 words between Income and Company. It appears that 'by' and 'the' are being ignored. Does this behavior have anything to do with noise words?
It's not due to noise words (they always count for w/N), but to the fact that linear w/N search currently matches words up to 2N left and 2N right of the anchor word, instead of N left and N right. Index search (activated with set indexwithin=7) may do the same, but is usually limited to N left and N right, as expected.
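To make the arithmetic concrete, here is a rough Python sketch of the two window sizes described above. It is only a model for counting words, not how Texis is implemented: the tokenizer (runs of letters and digits), the helper names, and the rule that "within N" means a position difference of at most N are all assumptions made for illustration.

```python
import re

def tokenize(text):
    """Split into lowercase alphanumeric words. Every word counts, noise words included.
    (Assumed tokenizer for illustration; real index expressions may split differently.)"""
    return re.findall(r"[a-z0-9]+", text.lower())

def within(text, anchor, other, n, window_factor=1):
    """True if `other` occurs within n * window_factor word positions of `anchor`.
    window_factor=1 models the expected N-left/N-right window;
    window_factor=2 models the 2N-left/2N-right window of linear search."""
    words = tokenize(text)
    reach = n * window_factor
    anchor_pos = [i for i, w in enumerate(words) if w == anchor]
    other_pos = [i for i, w in enumerate(words) if w == other]
    return any(abs(i - j) <= reach for i in anchor_pos for j in other_pos)

def first_matching_n(text, anchor, other, window_factor, max_n=20):
    """Smallest N for which the w/N query would match under this model, or None."""
    for n in range(1, max_n + 1):
        if within(text, anchor, other, n, window_factor):
            return n
    return None

text = "Income produced by edible lipgloss brought the company ,000,000 last year."

print(first_matching_n(text, "income", "company", window_factor=1))
print(first_matching_n(text, "income", "company", window_factor=2))
```

In this model "company" sits 7 word positions after "income", so an N-wide window first matches at w/7, while a 2N-wide window first matches at w/4 (2 x 4 = 8 covers the distance, 2 x 3 = 6 does not). That agrees with the hits at w/4 or higher reported earlier in the thread.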