We have some pages that have character set issues, and I'm not sure if there is any easy way to resolve them.
In the walk settings (we are using the Search Appliance), we have the storage charset and source default charset both set to WINDOWS-1252. XML UTF-8 is set to N.
In the search settings, we left the display charset blank.
This seemed to resolve most of our charset issues. On the HTML pages where we still have problems, though, it seems to happen only where a numeric character reference is used rather than the named HTML entity. For example, the same file has ™ in one location and ® in another. The ® displays correctly, but the ™ does not.
Other than going into each file and changing all of the references, does anyone have any ideas for fixing this?
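If editing the files turns out to be unavoidable, the change could at least be scripted rather than done by hand. The following is only a minimal sketch, assuming the failing references really are numeric ones (such as &#8482; or &#153;) and that named entities display correctly in the appliance, as ® does here. The function names are hypothetical, and it uses only Python's standard library. It has not been tested against the Search Appliance.

```python
import re
from html import unescape
from html.entities import codepoint2name

# Matches decimal (&#8482;) and hexadecimal (&#x2122;) character references.
NUMERIC_REF = re.compile(r"&#(?:[xX][0-9a-fA-F]+|[0-9]+);")


def to_named(match):
    """Return the named entity for one numeric reference, if one exists."""
    ref = match.group(0)
    # unescape() follows HTML5 rules, so a Windows-1252 style reference
    # such as &#153; is decoded to the trademark sign (U+2122).
    char = unescape(ref)
    if len(char) != 1:
        return ref  # invalid reference: leave it untouched
    name = codepoint2name.get(ord(char))
    # Characters without an HTML 4 entity name keep their numeric form.
    return f"&{name};" if name else ref


def rewrite_numeric_refs(html_text):
    """Replace numeric character references with named entities where possible."""
    return NUMERIC_REF.sub(to_named, html_text)


if __name__ == "__main__":
    sample = "Acme&#8482; and Acme&reg;"
    print(rewrite_numeric_refs(sample))
```

Run over a copy of the affected files first (read and written with the same encoding the pages already use), so the originals can be compared against the output before anything is re-walked.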