<fmt> is built into Vortex and is therefore faster than <exec>ing a separate iconv process. Also, with <exec> you would need to HTML-escape the iconv output (for XML) afterwards with <fmt "%H">, whereas <fmt "%hhV"> does that already. (But note that "%hhV" assumes the *input* is ISO-8859-1 and already has its ampersands HTML-escaped.)
A more generic -- and just as fast -- way to convert charsets is with:
<urlutil charsetconv $datatoconvert $sourcecharset "UTF-8">
<strfmt "%H" $ret> <!-- HTML-escape for XML -->
<$converteddata = $ret>
which will handle any builtin or iconv charset. It will use the same (internal/fast) routines as <fmt> if the charset pair can be handled that way, otherwise it <exec>s iconv. Note the <strfmt> for HTML-escaping for XML.
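For readers more familiar with a general-purpose language, the two steps above (convert the charset, then HTML-escape for XML) can be sketched in Python. This is only an analogy, not Vortex code: the function name to_utf8_xml_text is hypothetical, and Python's html.escape is assumed to be a rough stand-in for "%H", not an exact equivalent.

```python
import html


def to_utf8_xml_text(raw: bytes, source_charset: str) -> bytes:
    """Decode raw bytes from source_charset, HTML-escape, return UTF-8 bytes.

    Hypothetical helper: mirrors the convert-then-escape order of the
    Vortex snippet, but is not the Vortex implementation.
    """
    # Step 1: interpret the bytes in the declared source charset.
    text = raw.decode(source_charset)
    # Step 2: escape &, <, > and quotes so the text is safe inside XML.
    escaped = html.escape(text, quote=True)
    # Step 3: emit the result as UTF-8.
    return escaped.encode("utf-8")


if __name__ == "__main__":
    print(to_utf8_xml_text(b"caf\xe9 & <b>", "iso-8859-1"))
```

The escape is applied after decoding so that it operates on characters, not on bytes of a multi-byte encoding.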
If $sourcecharset is "Unknown", does <urlutil charsetconv> still attempt to convert the data to UTF-8? Or should I not even bother calling it?
Any charset that is not known internally is punted to an exec'd iconv process. So $sourcecharset of "Unknown" would indeed exec an iconv; you should not bother calling <urlutil charsetconv> then.
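The same "skip the conversion when the charset is not recognized" guard can be sketched in Python, again only as an analogy to the advice above. The name convert_if_known is hypothetical, and Python's codec registry is assumed here as the stand-in for the set of charsets the converter knows about.

```python
import codecs
from typing import Optional


def convert_if_known(raw: bytes, source_charset: str) -> Optional[bytes]:
    """Return raw converted to UTF-8, or None if the charset is not recognized.

    Hypothetical helper: the point is to check the charset label first
    instead of handing an unusable label to the converter.
    """
    try:
        # Raises LookupError for labels such as "Unknown".
        codecs.lookup(source_charset)
    except LookupError:
        return None
    return raw.decode(source_charset).encode("utf-8")


if __name__ == "__main__":
    print(convert_if_known(b"\xe9", "ISO-8859-1"))
    print(convert_if_known(b"abc", "Unknown"))
```

A caller can then leave the data untouched, or fall back to some other handling, whenever None comes back.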
I'm using <urlutil charsetconv> to convert a piece of text to UTF-8 from a page that's encoded in ISO-8859-1. It seems to be having trouble converting some of the characters in this string:
gameplay variety to the series — players
is translated to:
gameplay variety to the series — players
Is this the correct conversion to UTF-8? The translated characters do not seem to be valid UTF-8 characters.
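It looks like that string was already in UTF-8 (the dash is an em dash, U+2014); converting UTF-8 data as if it were ISO-8859-1 will indeed produce incorrect output, because each byte of the multi-byte character is treated as a separate ISO-8859-1 character and re-encoded. Perhaps the page's charset was incorrectly labelled?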
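The effect can be reproduced outside Vortex. The following Python sketch is an illustration only (the variable and function names are hypothetical): it takes UTF-8 bytes containing an em dash, misreads them as ISO-8859-1, and re-encodes them as UTF-8, which is what happens when a UTF-8 page is labelled ISO-8859-1.

```python
def looks_like_utf8(raw: bytes) -> bool:
    """Heuristic: True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Text containing an em dash (U+2014), stored as UTF-8: bytes E2 80 94.
original = "series \u2014 players"
utf8_bytes = original.encode("utf-8")

# Mislabelled conversion: each of the three bytes becomes its own character.
misread = utf8_bytes.decode("iso-8859-1")
double_encoded = misread.encode("utf-8")

# Reversing the mistake recovers the original text.
repaired = double_encoded.decode("utf-8").encode("iso-8859-1").decode("utf-8")

if __name__ == "__main__":
    print(utf8_bytes.hex())       # e28094 appears once, intact
    print(double_encoded.hex())   # the dash is now six bytes: c3a2 c280 c294
    print(repaired == original)
```

Checking the raw data with something like looks_like_utf8 before converting is one way to catch a mislabelled page, though it is only a heuristic: pure ASCII and some genuine ISO-8859-1 byte sequences also pass as valid UTF-8.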