Andrew, this posting is 5 years old. At that time we did not try to support Free Webinator in non-English installations. Although it was capable of it (Webinator was the basis for several popular whole-web search engines), we were focused on the explosive for-profit side of the business.
Every character set is indexable. The solution only becomes more complex when ISO, ANSI, and several Unicode variants are mixed within the same database. In that case, the method we choose becomes somewhat application dependent. The complexity arises because a user might type an ISO query that must be resolved simultaneously against both Unicode and ISO documents.
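To make the mixed-encoding problem concrete, here is a minimal Python sketch. It is not how Webinator or Texis handles this; it only illustrates one possible approach, assuming the documents are stored as raw bytes and the query is expanded into one byte pattern per encoding. The names `query_variants`, `search`, and `raw_docs` are hypothetical.

```python
def query_variants(query, encodings=("iso-8859-1", "utf-8", "utf-16-le")):
    """Return the byte patterns the query would have under each encoding."""
    variants = set()
    for enc in encodings:
        try:
            variants.add(query.encode(enc))
        except UnicodeEncodeError:
            # The query contains characters this encoding cannot represent.
            pass
    return variants


def search(raw_docs, query):
    """Return ids of documents whose raw bytes contain any query variant."""
    variants = query_variants(query)
    return [doc_id for doc_id, raw in raw_docs.items()
            if any(v in raw for v in variants)]


# Hypothetical database mixing ISO-8859-1 and UTF-8 documents.
raw_docs = {
    "latin1": "Le caf\u00e9 est ouvert".encode("iso-8859-1"),
    "utf8": "Un caf\u00e9 noir".encode("utf-8"),
    "none": "Tea only".encode("utf-8"),
}

hits = search(raw_docs, "caf\u00e9")
print(hits)
```

Expanding the query costs extra work at search time. The alternative, converting every document to one encoding at index time, moves the cost to indexing, which is part of why the choice is application dependent.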
Our indexes are defined purely by specifying one or more regular expressions. With a properly defined set of indexing expressions, any character set or combination of character sets may be indexed. The technical decisions here require balancing index size against performance, since the same string might be indexed several different ways.
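As a rough illustration of the size-versus-performance trade-off, the following Python sketch builds a toy inverted index from a list of regular expressions. It uses Python's `re` syntax rather than Texis's own expression syntax, and the expressions, the `build_index` function, and the sample documents are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical index expressions (not Thunderstone's defaults).
INDEX_EXPRESSIONS = [
    re.compile(r"[A-Za-z0-9]+"),                    # plain alphanumeric runs
    re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)+"),  # hyphenated compounds
]


def build_index(docs, expressions):
    """Map every string matched by any expression to the ids of its documents."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for expr in expressions:
            for match in expr.finditer(text):
                index[match.group(0).lower()].add(doc_id)
    return index


docs = {1: "State-of-the-art search", 2: "The art of search"}

small = build_index(docs, INDEX_EXPRESSIONS[:1])  # one expression
large = build_index(docs, INDEX_EXPRESSIONS)      # both expressions

print(len(small), len(large))
```

With the second expression added, "State-of-the-art" is indexed both as a whole term and as its four component words. The index grows, but a query for the compound can be answered with a single lookup.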
For the ISO Webinator answer see:
http://thunderstone.master.com/texis/ma ... m36f66de30
Also see:
http://www.thunderstone.com/texis/site/ ... ssues.html
Much more background information is available by searching for "index expression" on our site.
Any vendor who purports to have a single magic-bullet solution to all the problems that can occur in this space is either misinformed or lying. I pointed you to eBay Japan as an example because the Japanese text is a mix of three different character set styles: Kana, Kanji, and ASCII. As I said before, Thunderstone consulting is more than capable of helping you work through any of these issues. Please contact our sales department to arrange this.