Proximity search question

Posted: Mon Mar 16, 2015 4:49 pm
by rabbott
Hi,
I have a question about proximity searching, specifically how words are counted.

Using Texis Version 05.01.1154642055 here.

I created a table and inserted a row:

create table tbltest(ID int,author varchar(50));
insert into tbltest values(1,'Denise Morse-Rothwell Lisa Mealing Michelle Marie Martin Nanette P Drake');
create metamorph inverted index idxmauthor on tbltest(author);
set withinmode='word';
set indexwithin=1;

When I run the following search: "select ID from tbltest where AUTHOR like 'Lisa w/3 +Nanette'", I get one hit, even though there are four words between Lisa and Nanette. Either I've done something wrong here, or I don't really understand proximity searching (probably both!)
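
The word count above can be checked with a small Python sketch. It is not Texis code: it assumes words are split on whitespace only, which may differ from how Texis tokenizes (for example around the hyphen in Morse-Rothwell), and `words_between` is a hypothetical helper made up for this illustration.

```python
def words_between(text, first, second):
    """Count whitespace-delimited tokens strictly between two terms.

    Assumption: simple whitespace tokenization, not Texis's own rules.
    """
    tokens = text.split()
    i = tokens.index(first)
    j = tokens.index(second)
    return abs(j - i) - 1


row = "Denise Morse-Rothwell Lisa Mealing Michelle Marie Martin Nanette P Drake"
print(words_between(row, "Lisa", "Nanette"))  # 4 (Mealing, Michelle, Marie, Martin)
```

Under that counting, a w/3 search would be expected to reject this row, which is why the single hit looks wrong.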

Any help would be appreciated. Thanks.

Posted: Tue Mar 17, 2015 2:18 pm
by mark
Did you get any warnings for that query? I'm guessing you don't have alwithin and alpostproc on.

Posted: Tue Mar 17, 2015 2:31 pm
by Kai
alwithin=1 needs to be set to allow the within (`w/') operator.

Also, you should not need to set indexwithin. It is a set of bit flags, not a boolean, and should default to 7 in your version and later, which enables all possible index usage for within. Setting it to 1 disables the index for withinmode word, so the within has to be resolved by a post-process. Post-processing is not done by default (though a warning is issued), and thus the row is not excluded.
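
To illustrate the bit-flag point, here is a minimal Python sketch that lists which flag values are set in a given integer. It says nothing about what each individual indexwithin flag means in Texis; `set_bits` is a hypothetical helper for this illustration only.

```python
def set_bits(value):
    """Return the individual flag values (powers of two) set in value."""
    return [1 << i for i in range(value.bit_length()) if value & (1 << i)]


print(set_bits(1))  # [1]        only the lowest flag is on
print(set_bits(7))  # [1, 2, 4]  all three low flags are on
```

Read this way, indexwithin=1 does not mean "on": it turns on one flag and leaves the other two off, whereas 7 turns on all three.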

Posted: Tue Mar 17, 2015 2:37 pm
by rabbott
I didn't get any warnings for the previous query, and when I run the following query I still get one hit.

tsql "set withinmode='word';set alwithin=1;select ID from tbltest where AUTHOR like 'Lisa w/3 +Nanette'"

Posted: Tue Mar 17, 2015 2:52 pm
by mark
Explicitly setting "set indexwithin=7;" should do it for your version.

Posted: Tue Mar 17, 2015 3:01 pm
by rabbott
I wish I could say that did the trick, but I'm getting the same results. I added "set indexwithin=7" and still got a hit (whether or not I included "set alwithin=1").

Posted: Tue Mar 17, 2015 3:39 pm
by mark
I'm trying a slightly older version 5 than you. Here's my SQL:
drop table tbltest;
create table tbltest(DOCID int,AUTHOR varchar(50));
insert into tbltest values(1,'Denise Morse-Rothwell Lisa Nanette P Drake');
insert into tbltest values(2,'Denise Morse-Rothwell Lisa Martin Nanette P Drake');
insert into tbltest values(3,'Denise Morse-Rothwell Lisa Marie Martin Nanette P Drake');
insert into tbltest values(4,'Denise Morse-Rothwell Lisa Michelle Marie Martin Nanette P Drake');
insert into tbltest values(5,'Denise Morse-Rothwell Lisa Mealing Michelle Marie Martin Nanette P Drake');
create metamorph inverted index idxmauthor on tbltest(AUTHOR);
set withinmode='word';
set alwithin=1;
set indexwithin=7;
select DOCID,AUTHOR from tbltest where AUTHOR like 'Lisa w/3 +Nanette';

And output:
SQL 1>select DOCID,AUTHOR from tbltest where AUTHOR like 'Lisa w/3 +Nanette';

DOCID AUTHOR
------------+------------+
1 Denise Morse-Rothwell Lisa Nanette P Drake
2 Denise Morse-Rothwell Lisa Martin Nanette P Drake

Note that your version is 8+ years and 2 major revisions old.

Posted: Tue Mar 17, 2015 4:17 pm
by rabbott
I executed all the statements you ran, and I got 3 hits (DocIds 1, 2, 3). Is it safe to say that this is due to us having an old version of Texis? Any idea what could have changed?

Thanks for all your help on this.

Posted: Tue Mar 17, 2015 5:12 pm
by mark
The definition of the within radius may have changed to be inclusive in your version vs. my older test.
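
As a rough illustration of the inclusive vs. exclusive idea, the Python sketch below applies both readings of w/3 to the five test rows. It is only a model that happens to reproduce the two result sets reported in this thread, not a description of how Texis actually evaluates within. It assumes whitespace tokenization and measures distance as the difference in word positions; `distance` and `matches` are hypothetical helpers.

```python
ROWS = {
    1: "Denise Morse-Rothwell Lisa Nanette P Drake",
    2: "Denise Morse-Rothwell Lisa Martin Nanette P Drake",
    3: "Denise Morse-Rothwell Lisa Marie Martin Nanette P Drake",
    4: "Denise Morse-Rothwell Lisa Michelle Marie Martin Nanette P Drake",
    5: "Denise Morse-Rothwell Lisa Mealing Michelle Marie Martin Nanette P Drake",
}


def distance(text, a, b):
    """Difference in word position between two terms (adjacent words = 1)."""
    tokens = text.split()
    return abs(tokens.index(b) - tokens.index(a))


def matches(n, inclusive):
    """DOCIDs whose Lisa/Nanette distance is within n (<= n or < n)."""
    limit = n if inclusive else n - 1
    return [doc for doc, text in ROWS.items()
            if distance(text, "Lisa", "Nanette") <= limit]


print(matches(3, inclusive=False))  # [1, 2]     matches the older test above
print(matches(3, inclusive=True))   # [1, 2, 3]  matches the three hits reported
```

Neither reading admits row 5, so this model does not explain the single hit in the original one-row test.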

Posted: Wed Mar 18, 2015 8:37 am
by John
I think the older version may have counted within 3 words of the center of the match, i.e., within a 6-word span.