Searching PDF documents using metadata like pdf file name, date created, date modified, etc

Posted: Mon Oct 13, 2014 6:15 pm
by talla.kumar
Hi All,

I am new to Thunderstone. I am trying to find out whether it is possible to search PDF documents using metadata such as the PDF file name, date created, date modified, etc.

Any help will be appreciated.

Posted: Tue Oct 14, 2014 10:21 am
by jason112
> using metadata like pdf file name

The file name can be included in the searchable content by adding it to the "Index Fields" setting.

> date modified

The Modified Date of the PDF should be automatically extracted and used for things like "Modified Date Greater Than" queries and "order by date".

> date created

You can choose to use date created instead of date modified for "the date" of the pdf if you want. Depending on exactly which Thunderstone product you go with, it's possible to keep another date in a separate user field to be searched or sorted.

Posted: Tue Oct 14, 2014 5:18 pm
by talla.kumar
Thanks Jason.

I am working on a proof of concept, so I want to demonstrate that we can search PDF documents by their creation date. We are using Webinator 6.1.0. Can I do this without creating a custom search page?

Posted: Wed Oct 15, 2014 6:18 pm
by jason112
> I want to demonstrate that we can search PDF
> documents using date created for PDF

There are two things you'll need to do.

* First, change "Result Style" in the search settings to something that shows the date; otherwise you won't see any effect from this change.

* Second, create a "Data From Field" rule that instructs the walk to use the CreationDate as the modified date, like this:

From Field: Meta Field ->
From Meta Field: CreationDate
To Field: Modify Date

The next time a "new" walk is performed, the PDFs should use their creation date instead of modified date.

Posted: Fri Oct 17, 2014 1:34 pm
by talla.kumar
I have made the changes in Webinator as you suggested, but it is still not picking up the created date of the PDF document.

I have tried both "CreationDate" and "Creation" as the value for "From Meta Field"; neither worked.

Do I need to do anything else for this to work?

Posted: Fri Oct 17, 2014 1:55 pm
by jason112
Please send a message to support at thunderstone.com with an example PDF so we can check the internals and make sure things match up.
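
In the meantime, a quick way to see whether a PDF carries a CreationDate at all is to look for the /CreationDate and /ModDate keys of its Info dictionary. The following is only a rough Python sketch, not Thunderstone code: the names parse_pdf_date and find_info_dates are hypothetical, and it assumes the Info dictionary is stored uncompressed, unencrypted, and with plain parenthesized strings. A PDF that keeps its dates only in XMP metadata or inside a compressed object stream will show nothing here, which may itself explain a rule that does not match.

```python
import re
from datetime import datetime, timedelta, timezone

# PDF date strings look like D:YYYYMMDDHHmmSSOHH'mm'; everything after
# the year is optional. This sketch requires the leading "D:" prefix.
_PDF_DATE = re.compile(
    r"D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:(Z)|([+-])(\d{2})'?(\d{2})?'?)?"
)

# Matches /CreationDate (...) or /ModDate (...) written as literal strings.
_INFO_DATE = re.compile(rb"/(CreationDate|ModDate)\s*\(([^)]*)\)")


def parse_pdf_date(text):
    """Convert a PDF date string to a datetime, or None if it does not match."""
    m = _PDF_DATE.match(text.strip())
    if m is None:
        return None
    year = int(m.group(1))
    # Missing month/day default to 1, missing time parts default to 0.
    month, day, hour, minute, second = (
        int(part) if part else default
        for part, default in zip(m.groups()[1:6], (1, 1, 0, 0, 0))
    )
    tz = None
    if m.group(7):  # "Z" means UTC
        tz = timezone.utc
    elif m.group(8):  # explicit +HH'mm' or -HH'mm' offset
        offset = timedelta(hours=int(m.group(9)), minutes=int(m.group(10) or 0))
        tz = timezone(offset if m.group(8) == "+" else -offset)
    return datetime(year, month, day, hour, minute, second, tzinfo=tz)


def find_info_dates(pdf_bytes):
    """Return the CreationDate/ModDate values found in raw PDF bytes.

    If a key occurs more than once (for example after an incremental
    save), the last occurrence wins.
    """
    found = {}
    for key, value in _INFO_DATE.findall(pdf_bytes):
        found[key.decode("ascii")] = parse_pdf_date(value.decode("latin-1"))
    return found
```

To try it on a real file, read the file in binary mode and pass the bytes in, for example find_info_dates(open("example.pdf", "rb").read()), where example.pdf is a placeholder name. An empty result or a missing CreationDate key suggests the PDF does not expose that date in the form assumed here.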