Proximity search question

rabbott
Posts: 6
Joined: Mon Dec 29, 2014 6:07 pm

Post by rabbott »

Hi,
I have a question about proximity searching, specifically how words are counted.

Using Texis Version 05.01.1154642055 here.

I created a table and inserted a row:

create table tbltest(ID int,author varchar(50));
insert into tbltest values(1,'Denise Morse-Rothwell Lisa Mealing Michelle Marie Martin Nanette P Drake');
create metamorph inverted index idxmauthor on tbltest(author);
set withinmode='word';
set indexwithin=1;

When I run the following search: "select ID from tbltest where AUTHOR like 'Lisa w/3 +Nanette'", I get one hit. There are four words between Lisa and Nanette. Either I've done something wrong here, or I don't really understand proximity searching (probably both!)
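For reference, the word count above can be checked with a small sketch. It assumes words are split on whitespace only, which may not match how Texis tokenizes text (a hyphenated name such as Morse-Rothwell might be counted differently); the helper name words_between is made up for illustration.

```python
def words_between(text, first, second):
    """Count the words strictly between two terms, splitting on whitespace."""
    words = text.split()
    # index() finds the first occurrence of each term in the word list
    return abs(words.index(second) - words.index(first)) - 1

author = "Denise Morse-Rothwell Lisa Mealing Michelle Marie Martin Nanette P Drake"
print(words_between(author, "Lisa", "Nanette"))  # 4
```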

Any help would be appreciated. Thanks.
mark
Site Admin
Posts: 5513
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Did you get any warnings for that query? I'm guessing you don't have alwithin and alpostproc on.
Kai
Site Admin
Posts: 1270
Joined: Tue Apr 25, 2000 1:27 pm

Post by Kai »

alwithin=1 needs to be set to allow the within (`w/') operator.

Also, you should not need to set indexwithin: it is a set of bit flags, not a boolean, and should default to 7 in your version and later, which enables all possible index usage for within. Setting it to 1 disables the index for withinmode word, so the within has to be resolved by a post-process. That post-process is not done by default (although a warning is issued), and thus the row is not excluded.
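To illustrate the difference between a bit-flag value and a boolean, here is a minimal sketch that lists which bit positions are set in a number. It says nothing about what each indexwithin bit controls, since that is not stated in this thread; the helper name set_bits is made up for illustration.

```python
def set_bits(value):
    """Return the bit positions (0 = least significant) that are set in value."""
    return [pos for pos in range(value.bit_length()) if (value >> pos) & 1]

print(set_bits(7))  # [0, 1, 2]  three flags enabled
print(set_bits(1))  # [0]        only the lowest flag enabled
```

So a value of 1 is not "on" in the boolean sense: it turns on one flag and leaves the other two off.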
rabbott
Posts: 6
Joined: Mon Dec 29, 2014 6:07 pm

Post by rabbott »

I didn't get any warnings for the previous query, and when I run the following query I still get one hit.

tsql "set withinmode='word';set alwithin=1;select ID from tbltest where AUTHOR like 'Lisa w/3 +Nanette'"
mark
Site Admin
Posts: 5513
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Explicitly setting "set indexwithin=7;" should do it for your version.
rabbott
Posts: 6
Joined: Mon Dec 29, 2014 6:07 pm

Post by rabbott »

I wish I could say that did the trick, but I'm getting the same results. I added "set indexwithin=7" and still got a hit (whether or not I included "set alwithin=1").
mark
Site Admin
Posts: 5513
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

I'm trying a slightly older version 5 than you. Here's my SQL:
drop table tbltest;
create table tbltest(DOCID int,AUTHOR varchar(50));
insert into tbltest values(1,'Denise Morse-Rothwell Lisa Nanette P Drake');
insert into tbltest values(2,'Denise Morse-Rothwell Lisa Martin Nanette P Drake');
insert into tbltest values(3,'Denise Morse-Rothwell Lisa Marie Martin Nanette P Drake');
insert into tbltest values(4,'Denise Morse-Rothwell Lisa Michelle Marie Martin Nanette P Drake');
insert into tbltest values(5,'Denise Morse-Rothwell Lisa Mealing Michelle Marie Martin Nanette P Drake');
create metamorph inverted index idxmauthor on tbltest(AUTHOR);
set withinmode='word';
set alwithin=1;
set indexwithin=7;
select DOCID,AUTHOR from tbltest where AUTHOR like 'Lisa w/3 +Nanette';

And output:
SQL 1>select DOCID,AUTHOR from tbltest where AUTHOR like 'Lisa w/3 +Nanette';

   DOCID       AUTHOR
------------+------------+
           1 Denise Morse-Rothwell Lisa Nanette P Drake
           2 Denise Morse-Rothwell Lisa Martin Nanette P Drake

Note that your version is 8+ years and 2 major revisions old.
rabbott
Posts: 6
Joined: Mon Dec 29, 2014 6:07 pm

Post by rabbott »

I executed all the statements you ran, and I got 3 hits (DocIds 1, 2, 3). Is it safe to say that this is due to us having an old version of Texis? Any idea what could have changed?

Thanks for all your help on this.
mark
Site Admin
Posts: 5513
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

The definition of the within radius may have changed to be inclusive in your version vs. my older test.
John
Site Admin
Posts: 2595
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

I think the older version may have counted within 3 words of the center of the match, i.e. within a 6 word span.
John Turnbull
Thunderstone Software
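
As a rough way to compare the two result sets reported in this thread (DOCIDs 1, 2 in one test and 1, 2, 3 in the other), here is a sketch of two hypothetical counting rules applied to the five test rows. Neither rule is confirmed as what any Texis version actually does; they are simply two readings of "w/3" that happen to reproduce those results under whitespace tokenization. The names position_gap, span_rule, gap_rule and hits are made up for illustration.

```python
# The five test rows from the thread, keyed by DOCID.
ROWS = {
    1: "Denise Morse-Rothwell Lisa Nanette P Drake",
    2: "Denise Morse-Rothwell Lisa Martin Nanette P Drake",
    3: "Denise Morse-Rothwell Lisa Marie Martin Nanette P Drake",
    4: "Denise Morse-Rothwell Lisa Michelle Marie Martin Nanette P Drake",
    5: "Denise Morse-Rothwell Lisa Mealing Michelle Marie Martin Nanette P Drake",
}

def position_gap(text, first="Lisa", second="Nanette"):
    """Difference in word positions between the two terms (adjacent words = 1)."""
    words = text.split()
    return abs(words.index(second) - words.index(first))

def span_rule(gap, n):
    """Hypothetical rule A: both terms must fit inside a window of n words."""
    return gap + 1 <= n

def gap_rule(gap, n):
    """Hypothetical rule B: the second term is at most n positions away."""
    return gap <= n

def hits(rule, n=3):
    """DOCIDs whose Lisa/Nanette gap satisfies the given rule for w/n."""
    return [doc_id for doc_id, text in ROWS.items() if rule(position_gap(text), n)]

print(hits(span_rule))  # [1, 2]
print(hits(gap_rule))   # [1, 2, 3]
```

Running the same five-row test on both versions, as was done above, is what distinguishes the two readings; a single row cannot.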