I had an application developed for me where I pass search terms to TEXIS for retrieval. I would like to eliminate hits separated by punctuation. Is the following syntax correct and what effect does including the punctuation removal have on the performance versus only specifying character proximity?
Submitted search string with 30 character proximity: word1 word2 /[^\,] {30}
You've got the right idea. The correct form is word word w/[^\,]{30} or word word w/[^\punct]{30}.
The performance cost is that each record containing the two words must be examined for the proximity. If that is a small number of records (i.e. <1000) it won't be noticeable, but if it's large (i.e. >10000) it could slow you down a little.
Will it eliminate hits where the punctuation touches word 1 on the left or word 2 on the right, or only punctuation between word 1 and word 2? I only want it to eliminate hits where the two words are separated by the punctuation.
tsql "select id, mminfo ('house blend W/[^,]{,10}',body,0,1,0) from test"
The capitalized 'W' indicates that you want to include the delimiter in what is searched, and you want 0 to 10 non-comma characters, not exactly 10. The previous query was looking for 10 characters in a row that were not ',' as the delimiter; where you had ", house," that doesn't match, so it kept looking further.
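To see why {,10} and {10} behave so differently, here is a rough sketch in Python. It is only an analogy: Python's re module is not Texis REX, the helper name words_within_gap is made up for illustration, and it models just the gap between the two words rather than how Texis actually applies the delimiter.

```python
import re

def words_within_gap(text, first, second, gap_pattern):
    """True if `first` is followed by `second` with only `gap_pattern` between them.

    Hypothetical helper for illustration; this is Python regex, not Texis.
    """
    pattern = re.escape(first) + gap_pattern + re.escape(second)
    return re.search(pattern, text) is not None

bounded = r"[^,]{,10}"  # 0 to 10 non-comma characters
exact = r"[^,]{10}"     # exactly 10 non-comma characters

# One space between the words: the bounded form accepts it.
print(words_within_gap("our house blend", "house", "blend", bounded))   # True
# The exact form demands 10 characters in the gap, so it fails here.
print(words_within_gap("our house blend", "house", "blend", exact))     # False
# A comma between the words is rejected by the [^,] set.
print(words_within_gap("the house, blend", "house", "blend", bounded))  # False
```

The point is only that the lower bound matters: with {10} a short gap can never satisfy the pattern, while {,10} accepts anything from zero up to ten characters.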
If I am correctly using the construct word1 word2 W/[^,]{,8} to eliminate hits where word1 and word2 are within 8 characters and separated by a comma, then to eliminate hits with a period as well as hits with a comma, would I use
word1 word2 W/[^,.]{,8}? Does the punctuation order matter in [^,.] vs [^.,]?
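For what it's worth, in conventional regular expression syntax the characters inside a bracketed set are unordered, so [^,.] and [^.,] exclude the same two characters. The sketch below checks this with Python's re module; it assumes Texis REX treats sets the same way, which should be confirmed against the Texis documentation, and the sample strings are invented.

```python
import re

# Invented sample strings; word1/word2 stand in for real search terms.
samples = ["word1 word2", "word1, word2", "word1. word2", "word1 - word2"]

comma_first = re.compile(r"word1[^,.]{,8}word2")
period_first = re.compile(r"word1[^.,]{,8}word2")

results_comma_first = [bool(comma_first.search(s)) for s in samples]
results_period_first = [bool(period_first.search(s)) for s in samples]

print(results_comma_first)                          # [True, False, False, True]
print(results_comma_first == results_period_first)  # True
```

Both forms reject the hits where a comma or a period sits between the two words and keep the others.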