Error while fetching documents with "shift_jis" charsets
Posted: Tue Sep 29, 2009 6:02 am
by Maulik
Hi,
I am fetching Japanese documents in the "shift_jis" charset. The following errors appear:
- Cannot convert charset shift_jis to UTF-8 via converter /usr/local/morph3/etc/iconv-wrapper: No such file or directory in the function httransbuf
- Cannot completely convert charset UTF-8 to UTF-8: Invalid character sequence at source offset 109419 in the function htutf8_to_utf8
I am running Commercial Version 5.01.1231219356 20090106 (sparc-sun-solaris2.8-64-64) of Texis on Solaris 10.
Am I missing a file or library? Please provide your suggestions.
Regards,
Maulik
Posted: Tue Sep 29, 2009 10:30 am
by Kai
"etc/iconv-wrapper" is not the standard Webinator charset converter. Did you perhaps wrap the standard etc/iconv executable with a wrapper script called etc/iconv-wrapper at some point, then delete it?
Check that etc/iconv-wrapper exists and works properly. Alternatively, edit texis.cnf and comment out your [Texis] Charset Converter setting (it was probably edited) to fall back to the standard etc/iconv; you will then need to restart vhttpd if you are using it. Make sure that the etc/iconv executable exists in your install dir.
The second error ("Cannot completely convert charset UTF-8 to UTF-8") probably indicates corrupt UTF-8 data, or data that is labeled as UTF-8 but is actually in another charset. Can you provide the URL where that error occurred?
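To illustrate the second error: data in another charset that is labeled as UTF-8 fails validation at the first byte that is not legal UTF-8. The following is only a rough sketch in Python, not how Texis performs the check; the function name and the sample bytes are made up for illustration.

```python
def first_invalid_utf8_offset(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None if the data is valid."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte that could not be decoded.
        return exc.start
    return None


# Hypothetical sample: a 7-byte ASCII prefix followed by Shift_JIS text,
# standing in for a page that claims to be UTF-8 but is not.
mislabeled = b"title: " + "日本語".encode("shift_jis")
print(first_invalid_utf8_offset(mislabeled))

# The same text correctly encoded as UTF-8 passes the check.
print(first_invalid_utf8_offset("日本語".encode("utf-8")))
```

Running a check like this on the saved bytes of the failing page should show whether the reported source offset falls on mislabeled or corrupt data.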
Posted: Wed Sep 30, 2009 1:10 am
by Maulik
I have not written any wrapper for iconv, and there is no setting in the code that points to "etc/iconv-wrapper". The standard "iconv" file exists in morph3/etc.
I have not edited the default charset setting in texis.cnf. Currently, texis.cnf has the following charset settings (the setting itself is commented out):
; Charset Converter is an external program and arguments that translates
; its stdin in one charset to stdout in another. The variables
; %CHARSETFROM% and %CHARSETTO% will be replaced with charsets when run;
; embedded-space args may be double-quoted:
;Charset Converter = "%INSTALLDIR%/etc/iconv" -f %CHARSETFROM% -t %CHARSETTO% -c
Is there any other way to direct Texis to use the default iconv? If "iconv-wrapper" cannot be avoided, please describe what the wrapper should contain.
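For reference, the substitution described in the texis.cnf comment above can be approximated outside Texis to see which command line would result. This is a sketch based only on that comment, not Texis's actual implementation; the function name is hypothetical and the install dir is taken from the path in the first error message.

```python
import shlex


def expand_converter(template, install_dir, charset_from, charset_to):
    """Fill in the %...% variables and split the result into an argument list."""
    substitutions = {
        "%INSTALLDIR%": install_dir,
        "%CHARSETFROM%": charset_from,
        "%CHARSETTO%": charset_to,
    }
    for variable, value in substitutions.items():
        template = template.replace(variable, value)
    # shlex.split honors double quotes, so a quoted argument stays one item.
    return shlex.split(template)


default_setting = '"%INSTALLDIR%/etc/iconv" -f %CHARSETFROM% -t %CHARSETTO% -c'
argv = expand_converter(default_setting, "/usr/local/morph3", "shift_jis", "UTF-8")
print(argv)
# ['/usr/local/morph3/etc/iconv', '-f', 'shift_jis', '-t', 'UTF-8', '-c']
```

Running the resulting command by hand on a sample shift_jis file would confirm whether the standard etc/iconv works on its own.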
Posted: Wed Sep 30, 2009 10:29 am
by mark
The default setting of just "iconv" is clearly getting overridden somewhere. Check the code again looking for just "iconv-wrapper". Don't forget to check any modules that you may be using.
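If grep is not handy, a search like the one suggested here could be sketched in Python as follows. The function name is hypothetical, and the directory to scan (for example wherever your scripts and modules live) is an assumption you will need to adjust.

```python
import os


def find_references(root, needle="iconv-wrapper"):
    """Return (path, line number, line text) for every line under root containing needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps the scan going on files that are not valid UTF-8.
                with open(path, "r", encoding="utf-8", errors="replace") as handle:
                    for lineno, line in enumerate(handle, 1):
                        if needle in line:
                            hits.append((path, lineno, line.strip()))
            except OSError:
                # Skip files that cannot be opened or read.
                continue
    return hits


# Example call (path is an assumption):
# for path, lineno, text in find_references("/usr/local/morph3"):
#     print(f"{path}:{lineno}: {text}")
```

Searching for the bare string "iconv-wrapper" rather than the full path catches settings that build the path from a variable.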
Posted: Wed Sep 30, 2009 11:09 am
by John
You may want to open a ticket through our Tech Support page. It looks as if in the past we have explained the usage of <urlcp charsetconverter> to other members of your organization.
Posted: Thu Oct 01, 2009 12:51 am
by Maulik
I found that "iconv-wrapper" was being called through <urlcp charsetconverter> in a module, so I removed it. The application is now working as expected.
My company has been using your product for many years and many people have worked on it, so it is quite possible that someone asked a similar question before. Thanks a lot!