user specified noise list

KMandalia
Posts: 301
Joined: Fri Jul 09, 2004 3:50 pm

Post by KMandalia »

How can I exclude a phrase? I tried enclosing it in double quotes, but that doesn't work.

The individual words in the phrase are not noise words.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

You can't have a phrase as a noise word. You could create an equiv for the phrase that replaces it with a noise word, which would then get removed. E.g., in a user equiv file put

foo bar=the

So, the query
joe's foo bar script
would effectively become
joe's the script
and "the" would be removed.
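
A minimal sketch of that two-step idea, written in Python rather than Vortex. It is not how Texis is implemented: it simply assumes the query is rewritten by phrase equivs first and that noise words are then dropped one token at a time. The helper names `apply_equivs` and `remove_noise`, and the sample noise list, are hypothetical.

```python
import re


def apply_equivs(query, equivs):
    """Replace each phrase on the left of an equiv entry with its right side."""
    for phrase, replacement in equivs.items():
        # Word boundaries keep "foo bar" from matching inside "foo barn".
        pattern = r"\b" + re.escape(phrase) + r"\b"
        query = re.sub(pattern, replacement, query)
    return query


def remove_noise(query, noise):
    """Drop every whitespace-separated token that appears in the noise set."""
    return " ".join(w for w in query.split() if w.lower() not in noise)


# Hypothetical stand-ins for the user equiv file and the noise list.
equivs = {"foo bar": "the"}
noise = {"the", "a", "of"}

rewritten = apply_equivs("joe's foo bar script", equivs)
print(rewritten)                       # joe's the script
print(remove_noise(rewritten, noise))  # joe's script
```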

Post by KMandalia »

That's one way of doing it. Actually, I was doing
<apicp noise mynoiselist> where I was able to put anything in mynoiselist. It seems the script is still doing exactly the same thing. Can you suggest an alteration to the split statement right before it, so that it splits at a double quote followed by a space ("SPACE) instead of just a space?

Post by mark »

You've lost me completely. Maybe some more detail is in order.

Post by KMandalia »

Sorry.

In past versions of the search script, I would put the following in the init function:

<$mynoiselist= "noise" "more noise" "lots of noise">
<apicp noise $mynoiselist>

In the 5.1.23 search script, you are doing the same thing:

<split NONEMPTY "\space+" $SSc_noiselist></split>
<if $ret neq ''><apicp noise $ret></if>

My question: the noise list from the admin interface is split at whitespace, which only allows single-word noise terms. What is the syntax for splitting at " instead, so that I can enter noise terms enclosed in double quotes rather than separated by whitespace? Every time I find "[WORD/PHRASE]"SPACE"[WORD/PHRASE]", I would split at "SPACE.

BTW, I tried using equivs; that didn't work for me.
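
For illustration only, the quote-aware splitting described above could look like this in Python rather than Vortex. `parse_noise_list` is a hypothetical helper, not part of the search script. It only shows how such a list would be tokenized and says nothing about whether `apicp noise` would honor multi-word entries.

```python
import re

# A double-quoted entry (kept whole) or a bare run of non-space characters.
_ENTRY = re.compile(r'"([^"]+)"|(\S+)')


def parse_noise_list(raw):
    """Split a noise list on whitespace, keeping double-quoted phrases intact."""
    return [quoted or bare for quoted, bare in _ENTRY.findall(raw)]


print(parse_noise_list('"noise" "more noise" "lots of noise"'))
# ['noise', 'more noise', 'lots of noise']
```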

Post by mark »

No. Putting spaces in noise terms in the apicp call won't do anything. E.g., a noise term of 'foo bar' would not be removed from the query 'joes foo bar script'. And I wouldn't expect it to be removed from 'joes "foo bar" script' either.

How did you try the equivs?

Post by KMandalia »

Well, to give you an example: for us 'credit union' is a noise phrase, but 'credit' and 'union' are not.

Check out the following with query 'credit union':

http://search.creditunions.com/scripts/ ... onindustry

See, you don't get any meaningful results (for some reason it still shows some junk, which it does not for single noise words. Maybe you can help me with that).

Now, try the 5.1.23 script with 'credit union'

http://search.creditunions.com/scripts/ ... onindustry.

I am using eqvsusr.lst and backref to create eqvsusr, and then use the user equivalence file command, etc. In short, it works (try chip, you will get charles). When I say creditunion=credit union, then per your comment above, since I have creditunion as a noise word, credit union becomes one too and should be ignored. But it isn't.

Obviously any noise word/phrase loses its effect when enclosed in double quotes. That I know.

Post by mark »

You should use
credit union=the
in the equiv file (as long as "the" is still in your noise list).

Post by KMandalia »

1) What is the difference between the and creditunion if both are in my noise list?

2) Does it really matter whether the is on the RHS or the LHS in eqvsusr.lst? If yes, why?

3) So, for my purpose, there is no way to change the SPLIT to match "[WHITESPACE] instead of just [WHITESPACE], as it's done now?

Post by mark »

1) None, but your usage in the equiv was wrong.
2) Yes. That's the way = works.
3) Right, noise words may not contain spaces.
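
To make answers 1 and 2 concrete, here is a small Python model of the replacement as it is described in this thread: the left side of = is the text matched in the query, and the right side is what it becomes. This is a sketch under that assumption, not Texis code; `surviving_words` is a hypothetical helper and the noise set is a stand-in.

```python
def surviving_words(query, entry, noise):
    """Apply one 'left=right' equiv entry, then drop single-word noise terms."""
    left, right = entry.split("=", 1)
    rewritten = query.replace(left, right)
    return [w for w in rewritten.split() if w not in noise]


# Stand-in noise list containing both candidate replacement words.
noise = {"the", "creditunion"}

for entry in ("creditunion=credit union", "credit union=the"):
    print(entry, "->", surviving_words("credit union", entry, noise))
# creditunion=credit union -> ['credit', 'union']
# credit union=the -> []
```

With the phrase on the right-hand side, nothing in the query matches the left-hand side, so both words survive. With the phrase on the left, it is replaced by a single noise word and dropped.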