tiff image crawling

EricV
Posts: 1
Joined: Mon May 07, 2001 4:37 pm

Post by EricV »

We have a lot of Group IV TIFF images that have been OCR'd, and we need to know how to make Webinator index the text within the TIFFs. Can anyone tell me how? Supposedly Webinator will crawl TIFFs just like PDFs, but I can't get it to do so: it indexes only the LINK to the TIFF from another HTML page, not the text within the TIFF.
mark
Site Admin
Posts: 5513
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Not sure who told you that Webinator crawls TIFFs. In general, images are ignored. You would have to use the options to enable the TIFF MIME type and extension.

Webinator does not do any OCR, nor does it know how to extract any text tags that may be in the TIFF file. You may get the desired text by using the wordprocessor plugin with the -fother option. If not, you'll have to write your own plugin to handle them.
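
If you do end up writing your own plugin, here is a rough idea of the text-extraction part. It is only a sketch, and it assumes the OCR software stored its output in the TIFF's ASCII tags (for example ImageDescription, tag 270), which may not be true for your files. The function name `tiff_ascii_tags` is made up for illustration, it reads only the first image directory of a classic (non-BigTIFF) file, and it says nothing about how Webinator actually calls a plugin.

```python
import struct

ASCII_TYPE = 2  # TIFF field type for NUL-terminated ASCII strings


def tiff_ascii_tags(data: bytes) -> dict[int, str]:
    """Return {tag_number: text} for every ASCII field in the first IFD.

    Hypothetical helper: assumes the OCR text lives in ASCII tags.
    """
    # Bytes 0-1 give the byte order: "II" little-endian, "MM" big-endian.
    if data[:2] == b"II":
        endian = "<"
    elif data[:2] == b"MM":
        endian = ">"
    else:
        raise ValueError("not a TIFF file")

    # Bytes 2-3 hold the magic number 42, bytes 4-7 the first IFD offset.
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")

    (n_entries,) = struct.unpack_from(endian + "H", data, ifd_offset)
    tags = {}
    for i in range(n_entries):
        # Each directory entry is 12 bytes: tag, type, count, value/offset.
        entry = ifd_offset + 2 + 12 * i
        tag, ftype, count = struct.unpack_from(endian + "HHI", data, entry)
        if ftype != ASCII_TYPE:
            continue
        if count <= 4:
            # Values of 4 bytes or fewer are stored inline in the entry.
            raw = data[entry + 8 : entry + 8 + count]
        else:
            (offset,) = struct.unpack_from(endian + "I", data, entry + 8)
            raw = data[offset : offset + count]
        tags[tag] = raw.rstrip(b"\x00").decode("latin-1")
    return tags
```

Calling `tiff_ascii_tags(open(path, "rb").read())` on one of your files will show quickly whether any text is stored in the tags at all. If the result is empty, the OCR text is probably kept somewhere else and this approach won't help.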