strange problem with search term "privacy"

zoeoberon
Posts: 42
Joined: Mon Dec 11, 2000 1:32 pm

Post by zoeoberon »

I'm having a strange problem that I don't see any obvious answer to.

The query in question is:
select count(*) as total_count from corpus where ( Title\Keywords\Description\Content\MediaFileText likep 'privacy')

If I run this from TSQL, I get a count of around 12K, which is what I expect. If I run it from a Vortex script, I get a count of zero. There are no messages in the Vortex or error log indicating any errors.

Here are the various settings at the top of the Vortex script:

<apicp keepeqvs no>
<apicp alpostproc yes>
<apicp alwithin yes>
<apicp qminwordlen 1>
<SQL NOVARS "set hyphenphrase=0"></SQL> <apicp minwordlen 6>
<apicp exactphrase on> (or off depending on double quotes, doesn't matter here, same problem)

Other terms work fine including 'private', 'personal privacy' and 'privacy concerns'. Why would only 'privacy' return zero from the script?

Thanks for your help.
Kai
Site Admin
Posts: 1272
Joined: Tue Apr 25, 2000 1:27 pm

Post by Kai »

I would suspect some other setting is taking effect that we're not aware of.

Assuming you are using a Metamorph index, you can set indextrace=50 to see which atomic terms are being looked up in the index after suffix processing etc. This will produce copious messages; just save the first few dozen or so, up to "allmatch: N and: N ...". "fdbix_seek()" messages indicate words actually found in the index.
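
As a rough sketch of the above from tsql (assuming the table and field names from the original question, and that indextrace is set like any other `set` property; the `--` lines are annotations for the reader and can be dropped if your tsql does not accept them):

```sql
-- Turn on Metamorph index tracing at the level suggested above.
set indextrace=50;

-- The query under test, as posted in the question.
select count(*) as total_count from corpus
where ( Title\Keywords\Description\Content\MediaFileText likep 'privacy' );
```

Running the same two statements from the Vortex script and comparing the first few dozen trace lines against the tsql run should show whether both are looking up the same terms.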
zoeoberon
Posts: 42
Joined: Mon Dec 11, 2000 1:32 pm

Post by zoeoberon »

Okay, here are the results. I'm not sure what this is telling me, except that it looks like it found something?

//////////////////////////////////////

/usr/local/morph3/htdocs$ texis -traceidx -traceidx -traceidx -traceidx -traceidx sitename=directmag terms=privacy pbmmsearch/main.txt

[select count(*) as total_count from pbmmcorpus where ( Title\Keywords\Description\Content\MediaFileText likep 'privacy' ) and ( Sitedirname = 'directmag' )]
200 pbmmsearch:336: openfdbi(/pirt-dl/pbmm/db/xcorpus1, R, F) = 0x85FE768
200 pbmmsearch:336: mmap(/pirt-dl/pbmm/db/xcorpus1.tok, 0x0, 0x1AD9B0, R) = 0x402E7000
200 pbmmsearch:336: mmap()ing entire Metamorph index token file /pirt-dl/pbmm/db/xcorpus1.tok in the function openfdbi
200 pbmmsearch:336: Can't mmap() Metamorph index data file /pirt-dl/pbmm/db/xcorpus1.dat: (indexmmap & 2) off; using file I/O in the function openfdbi
200 pbmmsearch:336: 1/2 privacist
200 pbmmsearch:336: 1/1 privacist's
200 pbmmsearch:336: 1/1 privacists
200 pbmmsearch:336: 213884/220650 privacy
200 pbmmsearch:336: allmatch: 1 and: 0 set: 1 not: 0 minsets: 1
200 pbmmsearch:336: kdbf_readchunk(0x11E387BF, 0x10000) = 0x10000

[total_count=0]

No matching records found.<br>

<!-- end Texis -->
200 closefdbi(0x85FE768)
200 munmap(/pirt-dl/pbmm/db/xcorpus1.tok, 0x402E7000, 0x1AD9B0)

//////////////////////////

Here's the index being created:
set indexmem=40;
set delexp=0;
set addexp='[\alnum&\-\x80-\xff]{1,99}';
set addexp='[\alnum&\-\x80-\xff\x27]{1,99}';

create metamorph inverted index xcorpus1 on pbmmcorpus(Title\Keywords\Description\Content\MediaFileText);

////////////////////////

Have to say I'm still baffled - does this mean anything to you?
Kai
Site Admin
Posts: 1272
Joined: Tue Apr 25, 2000 1:27 pm

Post by Kai »

Yes, several words were found in the index that suffix-match `privacy'. So you should get results from the LIKEP.

But I noticed that your SQL now has an AND clause, which may be reducing the results. Try running the same SQL without the AND, exactly as you did in your tsql example.
zoeoberon
Posts: 42
Joined: Mon Dec 11, 2000 1:32 pm

Post by zoeoberon »

Sorry, I simplified the query when I posted the initial public question so that it would not include the client names. The exact SQL I was using for all of this testing is what you see with the AND clause. In other words, when I run it in TSQL I get 12K results, and from Vortex nothing.

Removing the AND clause does return results with the following index trace:

[select count(*) as total_count from pbmmcorpus where ( Title\Keywords\Description\Content\MediaFileText likep 'privacy' )]
200 pbmmsearch:336: openfdbi(/pirt-dl/pbmm/db/xcorpus1, R, F) = 0x85FDB28
200 pbmmsearch:336: mmap(/pirt-dl/pbmm/db/xcorpus1.tok, 0x0, 0x1AD9B0, R) = 0x402E7000
200 pbmmsearch:336: mmap()ing entire Metamorph index token file /pirt-dl/pbmm/db/xcorpus1.tok in the function openfdbi
200 pbmmsearch:336: Can't mmap() Metamorph index data file /pirt-dl/pbmm/db/xcorpus1.dat: (indexmmap & 2) off; using file I/O in the function openfdbi
200 pbmmsearch:336: 1/2 privacist
200 pbmmsearch:336: 1/1 privacist's
200 pbmmsearch:336: 1/1 privacists
200 pbmmsearch:336: 213884/220650 privacy
200 pbmmsearch:336: allmatch: 1 and: 0 set: 1 not: 0 minsets: 1
<snip out kdbf_readchunk>
[total_count=100000]

<!-- end Texis -->
200 closefdbi(0x85FDB28)
200 munmap(/pirt-dl/pbmm/db/xcorpus1.tok, 0x402E7000, 0x1AD9B0)
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

For questions you don't want to be public you can use the Tech Support link to send a private message to tech support.

If you run the queries with "set verbose=2;", it will show a little more about how the query is being processed. The Metamorph index portion appears to be correct.
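
Something along these lines, as a sketch only (the table, field names, and site value are taken from the trace posted above; the `--` lines are annotations and can be dropped if your tsql does not accept them):

```sql
-- Show more detail about how each part of the query is processed.
set verbose=2;

-- The failing form of the query, including the AND clause.
select count(*) as total_count from pbmmcorpus
where ( Title\Keywords\Description\Content\MediaFileText likep 'privacy' )
  and ( Sitedirname = 'directmag' );
```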
John Turnbull
Thunderstone Software
zoeoberon
Posts: 42
Joined: Mon Dec 11, 2000 1:32 pm

Post by zoeoberon »

Here's the same command but with verbose=2 in the script:

<snip>
200 pbmmsearch:337: 1/2 privacist
200 pbmmsearch:337: 1/1 privacist's
200 pbmmsearch:337: 1/1 privacists
200 pbmmsearch:337: 213884/220650 privacy
200 pbmmsearch:337: allmatch: 1 and: 0 set: 1 not: 0 minsets: 1
200 pbmmsearch:337: kdbf_readchunk(0x11E387BF, 0x10000) = 0x10000
<snip kdbf_readchunk>
200 pbmmsearch:337: Looking for index on pbmmcorpus (Sitedirname)
200 pbmmsearch:337: Opening index /pirt-dl/pbmm/db/xcorpus2 in the function ixbtindex
200 pbmmsearch:337: Comparing records
<snip Comparing records>
200 pbmmsearch:337: Expect to read 4% of the index in the function ixbtindex
999 pbmmsearch:337: Handling a table project in the function dotree
999 pbmmsearch:337: Handling a table select in the function dotree
999 pbmmsearch:337: No more rows [0] from pbmmcorpus
999 pbmmsearch:337: Deleting temp row
999 pbmmsearch:337: Handling a table project in the function dotree
[total_count=0]
///////////////////////////
The Sitedirname index is simply:
create index xcorpus2 on pbmmcorpus(Sitedirname);
which also seems fine.
zoeoberon
Posts: 42
Joined: Mon Dec 11, 2000 1:32 pm

Post by zoeoberon »

Any further thoughts on this?
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

If you reorder the clause to put the Sitedirname part first does that help?
John Turnbull
Thunderstone Software
zoeoberon
Posts: 42
Joined: Mon Dec 11, 2000 1:32 pm

strange problem with search term "privacy"

Post by zoeoberon »

I moved the search terms to the last condition in the SQL, and that did fix the problem. Why would putting it through that index first make a difference?
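
For reference, the reordering looks roughly like this. It is a sketch built from the table and field names in the traces above, not the exact text of the script, and the `--` lines are annotations only:

```sql
-- Original order (returned 0 from the Vortex script):
--   where ( Title\Keywords\Description\Content\MediaFileText likep 'privacy' )
--     and ( Sitedirname = 'directmag' )

-- Reordered so the Sitedirname condition comes first and LIKEP comes last:
select count(*) as total_count from pbmmcorpus
where ( Sitedirname = 'directmag' )
  and ( Title\Keywords\Description\Content\MediaFileText likep 'privacy' );
```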