Word Doc Titles being ignored

legedza.henry
Posts: 142
Joined: Wed Jul 24, 2002 11:52 pm

Post by legedza.henry »

Hi there,

I'm somewhat confused about when Webinator recognises the Title property in Word or PDF files.

For example, this document: http://www.decs.sa.gov.au/docs/files/co ... tion_a.doc has the Title property filled in, but when indexed Webinator displays it simply as an MSWORD Document.

On the other hand, this document: sa.edu.au/files/pages/OCMOG_Governance_Diag_0704.doc displays the appropriate title when indexed.

Can someone explain what the difference is? We have numerous documents with the Title property filled in, but Webinator displays them only as MSWORD documents.

Thanks
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

I get the title OK from that doc. What version are you running?

texis -version
legedza.henry
Posts: 142
Joined: Wed Jul 24, 2002 11:52 pm

Post by legedza.henry »

I tried it again and still got no title.

Here is the Texis version info:

Commercial Webinator 5.01.1149606138 winnt
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

It appears to be an issue on Windows versions. Modify dowalk to fix it. Change
!\n\n+\n\n
to
!\r\n\r\n+\r\n\r\n
The pattern occurs twice; change both occurrences.

Then rewalk.
legedza.henry
Posts: 142
Joined: Wed Jul 24, 2002 11:52 pm

Post by legedza.henry »

Thanks. That seems to have fixed the problem. I have some more documents to check, so I will let you know if any other examples arise.

Henry
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

I guess the previous solution wasn't sufficiently generic. Undo that change and make the following one instead.

Add this line
<sandr "\x0d\x0a" "\x0a" $page><$page=$ret>
just above the line
<rex max=1 row ">>=\alnum=[\alnum\-]+: =!\n\n+\n\n" $page>
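
To illustrate why normalizing the line endings first is more generic than editing the pattern, here is a rough Python sketch (not Vortex, and not part of dowalk). It assumes the text being scanned is a block of "Name: value" lines ending at the first blank line, which is what the rex expression appears to look for. The function names and the sample pages are made up for the example.

```python
import re


def normalize_newlines(page: str) -> str:
    """Replace every CRLF pair with a bare LF (the same idea as the sandr line)."""
    return page.replace("\r\n", "\n")


def split_header(page: str):
    """Split at the first blank line, written as LF only.

    Returns (header, body), or (None, page) when no blank line is found.
    """
    match = re.search(r"\n\n+", page)
    if match is None:
        return None, page
    return page[:match.start()], page[match.end():]


# Hypothetical sample pages: same content, different line endings.
crlf_page = "Title: Sample Title\r\nAuthor: Someone\r\n\r\nBody text"
lf_page = "Title: Sample Title\nAuthor: Someone\n\nBody text"

# Without normalization the LF-only pattern never sees a blank line in CRLF text.
raw_header, _ = split_header(crlf_page)

# After normalization both pages split the same way with one pattern.
crlf_header, crlf_body = split_header(normalize_newlines(crlf_page))
lf_header, lf_body = split_header(normalize_newlines(lf_page))
```

With the line endings normalized up front, a single LF-based pattern handles both cases, whereas a pattern rewritten for \r\n\r\n would only match the CRLF case.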
legedza.henry
Posts: 142
Joined: Wed Jul 24, 2002 11:52 pm

Post by legedza.henry »

Thanks.

This change seems to have solved the problem on initial testing.

Will monitor and let you know if the situation changes.