gw's -meta option will also extract from PDF files. The metas available from PDFs are:
Author CreationDate ModDate Creator Producer Title Subject Keywords
Can you restrict PDF and other documents so that they are searched and indexed by Filename and Title only? I will add keywords etc. later, but indexing the whole document is not needed.
With Webinator 2 you'd have to go back after the walk and clear the Body field with a SQL update statement. Another possibility would be to put a wrapper around anytotx to remove the body content from its output so gw never stores it.
With Webinator 4 it would be fairly simple to modify the dowalk script to not keep the Body text from the plugin.
For the Webinator 2 approach, clear the Body field after the walk and then rebuild the index:
texis -s -d /path/to/your/database "update html set Body=''"
gw -d/path/to/your/database -index
The wrapper would involve writing a program or shell script to use as the plugin, which would then call the real plugin and strip the body text from its answer before returning it to gw. Doing this is beyond the scope of free technical support.
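As a rough illustration of the wrapper idea only, here is a minimal shell sketch. It rests on assumptions that are not confirmed anywhere in this thread: that gw passes its arguments straight to the plugin, that the plugin writes its result to standard output, and that the output consists of header lines, then a blank line, then the body text. REAL_PLUGIN and strip_body are hypothetical names chosen for the example. Check what your plugin actually emits before adapting any of this.

```shell
#!/bin/sh
# Hypothetical wrapper plugin: run the real plugin, drop the body text.
# ASSUMPTION (unverified): the real plugin prints header lines, one blank
# line, then the body, all on standard output.

# Hypothetical placeholder; point this at the real plugin.
REAL_PLUGIN="${REAL_PLUGIN:-/path/to/anytotx}"

strip_body() {
    # Print lines up to and including the first blank line, then keep
    # reading (so the writer never sees a broken pipe) but print nothing.
    awk 'body { next } { print } /^$/ { body = 1 }'
}

# Only run when the real plugin is actually present and executable.
if [ -x "$REAL_PLUGIN" ]; then
    "$REAL_PLUGIN" "$@" | strip_body
fi
```

The filter can be tried on its own by piping sample text through strip_body before pointing gw at the wrapper. If the plugin's real output uses a different separator between metadata and body, only the awk pattern would need to change.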