Indexing and Searching PDF files

Thunderstone
Site Admin
Posts: 2504
Joined: Wed Jun 07, 2000 6:20 pm

Post by Thunderstone »



What is the current status of indexing PDF files for search via
Webinator? The latest info that I have is that Adobe is working on an
API. I am with the State of Louisiana and need something to replace
the workaround that we have been using with Webinator. We are
presently using the commercial version of Webinator and are running
the software under Linux.

Also, I could not find any documentation on the Adobe Acrobat Plug-In
other than that it has to be purchased. When I searched for PDF I could
not find any more information than what you have in your tech support
archive.

Please advise, thanks.

Post by Thunderstone »




Adobe's Acrobat API, and therefore the Webinator PDF plugin, is
available only on these platforms:
SunOS, Solaris, HP-UX, Windows NT/Windows 95 (Win32), SGI, and AIX

We have heard of someone using a Linux Webinator and accessing the
PDF filter on another platform via rsh or rexec. Not ideal, but feasible.
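
For anyone who wants to try the rsh route, here is a rough sketch of the
idea in Python. It is not something we ship or support. The host name
"sunbox" and the remote command "pdffilter" are made-up placeholders for
whatever machine and filter program you actually have, and the sketch
assumes the remote filter reads a PDF on stdin and writes plain text on
stdout, which you would need to confirm for your setup.

```python
import subprocess


def build_remote_command(host, filter_cmd, remote_shell="rsh"):
    """Return the argv that runs filter_cmd on host through the remote shell.

    host and filter_cmd are placeholders supplied by the caller; nothing
    here is specific to Webinator or to Adobe's API.
    """
    return [remote_shell, host, filter_cmd]


def pdf_to_text_remote(pdf_bytes, host, filter_cmd,
                       remote_shell="rsh", run=subprocess.run):
    """Send PDF bytes to a filter on another machine and return its text.

    Assumption: the remote filter reads the PDF on stdin and writes the
    extracted text on stdout. The run parameter exists only so the
    function can be exercised without a real remote host.
    """
    argv = build_remote_command(host, filter_cmd, remote_shell)
    # rsh forwards our stdin to the remote command and relays its stdout.
    result = run(argv, input=pdf_bytes, stdout=subprocess.PIPE, check=True)
    # Decode as ASCII, replacing any stray non-ASCII bytes.
    return result.stdout.decode("ascii", errors="replace")
```

You would call it along the lines of
pdf_to_text_remote(open("report.pdf", "rb").read(), "sunbox", "pdffilter")
and hand the returned text to whatever does your indexing. Every file
makes a round trip over the network, so expect this to be slow on large
collections.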

There's not much to say about the PDF plugin. You use the -n option of gw
to access it. PDF files are passed to the plugin, which returns the ASCII
text of the document to gw for indexing. See:
http://www.thunderstone.com/gwman/node21.html

The current Adobe API has minor issues with some PDF files.
Adobe is nearing release of a new API which seems to resolve those
issues and adds support for Acrobat 3. We will provide a newer
plugin when the API is stable.