Trouble with apostrophes
Posted: Wed Dec 23, 2009 6:15 am
by michel.weber
I'm trying to index French documents, but I can't figure out how to search properly for words containing apostrophes.
For example, a search for: migration et d'asile
does not find any results, but suggests: migration et asile
When I look at the suggestions, 'asile' is not highlighted.
What am I doing wrong?
Word forms: [\alnum\'\x80-\xff]{1,70}
Language Characters: \alpha\'\x80-\xFF
Post-processing is enabled.
Posted: Wed Dec 23, 2009 1:30 pm
by Kai
Is the apostrophe the same character (ASCII, U+0027) in both the query and the document? If not, the query term and document word will be considered different and not match.
Can you post the URL to an example document that should match but does not?
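A quick way to check which apostrophe character a query or document actually contains is to list the code points of the apostrophe-like characters in it. The following is a minimal sketch in Python, assuming the text is available as a Unicode string; the function name `describe_apostrophes` and the set of candidate characters are illustrative choices, not part of the search software.

```python
import unicodedata

# Characters that commonly stand in for an apostrophe (illustrative, not exhaustive).
APOSTROPHE_LIKE = {"\u0027", "\u2019", "\u2018", "\u02bc", "\u0060", "\u00b4"}

def describe_apostrophes(text):
    """Return (char, code point, Unicode name) for each apostrophe-like character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for ch in text
        if ch in APOSTROPHE_LIKE
    ]

# The typed query usually contains U+0027; word-processor output often has U+2019.
print(describe_apostrophes("d'asile"))
print(describe_apostrophes("d\u2019asile"))
```

If the two calls report different code points for the query and the document text, the terms will not compare equal unless one side is normalized first.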
Posted: Thu Dec 24, 2009 3:24 am
by michel.weber
Yes, it's the same.
I have taken the apostrophe out of the 'word forms' and 'language characters' settings, and now "asile" or "d'asile" match correctly.
I also added 'd' to the noise word list, which makes sense for French, but somehow it still shows up as highlighted.
I can't provide a link, as I'm working on a test machine that is not accessible from the internet.
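The effect of taking the apostrophe out of the word character set can be illustrated with a small tokenizer. This is a sketch only: the two patterns are Python approximations of the settings quoted earlier in the thread (not the search engine's own expression syntax), and the `NOISE` set and function names are hypothetical.

```python
import re

# Python approximations of the word expression, with and without the apostrophe.
WITH_APOSTROPHE = re.compile(r"[0-9A-Za-z'\x80-\xff]{1,70}")
WITHOUT_APOSTROPHE = re.compile(r"[0-9A-Za-z\x80-\xff]{1,70}")

# Hypothetical noise word list for the example.
NOISE = {"d", "et"}

def tokenize(pattern, text):
    """Split text into lowercase word tokens using the given word pattern."""
    return pattern.findall(text.lower())

def index_terms(pattern, text):
    """Tokenize, then drop noise words."""
    return [t for t in tokenize(pattern, text) if t not in NOISE]

query = "migration et d'asile"
print(tokenize(WITH_APOSTROPHE, query))       # d'asile stays one token
print(tokenize(WITHOUT_APOSTROPHE, query))    # d and asile become separate tokens
print(index_terms(WITHOUT_APOSTROPHE, query))
```

With the apostrophe treated as a word character, "d'asile" is a single term distinct from "asile"; without it, the elided article becomes its own token and can be handled as a noise word.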
Posted: Wed Dec 30, 2009 1:21 pm
by Kai
Can you open a tech support ticket and attach a zipped copy of All Walk Settings, the query, and an (HTML) page that the query should match? We'll take a look.
Posted: Wed Mar 17, 2010 10:35 am
by Kai
Fixes for these issues are now in the scripts on our web site. Note that highlighting of a *phrase* query such as "d asile" (as opposed to the separate, non-phrase words d asile) will not span apostrophes in the text, because linear matching of phrases only matches whitespace between words (to avoid spanning sentence endings).
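The difference between phrase and non-phrase highlighting described above can be sketched as follows. This is an illustration of the general idea (phrase words joined only by whitespace), written in Python with hypothetical helper names; it is not the code used in the actual scripts.

```python
import re

def phrase_pattern(phrase):
    """Match the phrase's words in order, separated only by whitespace."""
    words = phrase.split()
    return re.compile(r"\s+".join(re.escape(w) for w in words), re.IGNORECASE)

def word_patterns(query):
    """Match each query word independently, anywhere in the text."""
    return [
        re.compile(r"\b" + re.escape(w) + r"\b", re.IGNORECASE)
        for w in query.split()
    ]

text = "le droit d'asile"

# Phrase: the apostrophe between d and asile is not whitespace, so no match.
print(phrase_pattern("d asile").search(text))

# Separate words: d and asile are each found on their own.
print([bool(p.search(text)) for p in word_patterns("d asile")])
```

Restricting the separator to whitespace is what keeps a phrase from matching across punctuation such as a sentence-ending period, at the cost of not matching across an apostrophe either.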