Error in indexing attached documents

cyril.adam
Posts: 8
Joined: Tue Feb 06, 2007 9:58 am

Post by cyril.adam »

Hi

I have an issue when I try to walk a web site that contains Word documents.
I get an error like this:

The link : http://www..........file.doc
Had this error: Not in requirements

I can't search for words in any of the documents on the web site.

Do you have any idea what is causing this issue?

Thanks
John
Site Admin
Posts: 2623
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

Make sure you include .doc in the extension list you want indexed on the basic walk settings page.
John Turnbull
Thunderstone Software
cyril.adam
Posts: 8
Joined: Tue Feb 06, 2007 9:58 am

Post by cyril.adam »

Yes, of course .doc is in the extension list.

Here are the extensions defined:
.asp .aspx .doc .html .htm .jsp .pdf .php .swf .txt .xls
John
Site Admin
Posts: 2623
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

Another possibility is that you have Stay Under defined, or that the document is on a different server, so its URL does not match the required URL prefix. Or, if you have a "Required REX" set, the URL doesn't match that.
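
As a rough illustration of how checks like these combine, here is a small Python sketch. It is only a model of the idea, not the actual walker logic: the function name, its parameters, the reason strings, and the example.com URLs are all hypothetical, and the pattern check uses a Python regular expression as a stand-in for a REX expression, which has its own syntax.

```python
import re
from urllib.parse import urlparse

def rejection_reason(url, extensions, required_prefix=None, required_pattern=None):
    """Return why a URL would be skipped, or None if it passes every check."""
    path = urlparse(url).path
    # Extension check: the path must end with one of the allowed extensions.
    if not any(path.lower().endswith(ext) for ext in extensions):
        return "extension not in list"
    # Prefix check: models a Stay Under or same-server restriction.
    if required_prefix and not url.startswith(required_prefix):
        return "outside required prefix"
    # Pattern check: a Python regex used as an analogy for a required expression.
    if required_pattern and not re.search(required_pattern, url):
        return "does not match required pattern"
    return None

# Hypothetical settings and URLs, for illustration only.
exts = [".doc", ".html", ".pdf"]
prefix = "http://www.example.com/files/Pro15/"
print(rejection_reason("http://www.example.com/files/Pro15/a.doc", exts, prefix))
print(rejection_reason("http://www.example.com/files/Pro1/a.doc", exts, prefix))
print(rejection_reason("http://other.example.com/a.doc", exts, prefix))
```

In this model a .doc link can have an allowed extension and still be rejected, because every check must pass.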
John Turnbull
Thunderstone Software
John
Site Admin
Posts: 2623
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

"Stay Under" is an option. If it is set to Yes, the walk will only index documents whose URL starts with a prefix derived from the Base URL:

http://www.pgregister.coe.int/Pompidou_ ... jectfiles/

i.e. it will stay under the directory containing the Base URL.

Set Stay Under to "N" to crawl elsewhere on the server.
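
To make the prefix idea concrete, here is a minimal Python sketch of deriving the containing directory from a Base URL and testing other URLs against it. The function names and the example.com URLs are hypothetical, and the real setting may differ in its details.

```python
from urllib.parse import urlsplit, urlunsplit

def stay_under_prefix(base_url):
    """Return the directory part of base_url (everything up to the last '/')."""
    parts = urlsplit(base_url)
    directory = parts.path.rsplit("/", 1)[0] + "/"
    # Drop any query string and fragment; only the directory matters here.
    return urlunsplit((parts.scheme, parts.netloc, directory, "", ""))

def is_under(url, base_url):
    """True if url lies under the directory that contains base_url."""
    return url.startswith(stay_under_prefix(base_url))

# Hypothetical Base URL, for illustration only.
base = "http://www.example.com/files/Pro15/index.html"
print(stay_under_prefix(base))
print(is_under("http://www.example.com/files/Pro15/report.doc", base))
print(is_under("http://www.example.com/files/Pro1/report.doc", base))
```

Keeping the trailing slash on the prefix matters in this sketch: without it, a directory such as Pro150 would also count as being under Pro15.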
John Turnbull
Thunderstone Software
cyril.adam
Posts: 8
Joined: Tue Feb 06, 2007 9:58 am

Post by cyril.adam »

I've set Stay Under to No and I still have the same issue...

I want to index not only the documents under http://www.pgregister.coe.int/Pompidou_files/Pro15 but also those under .../Pro1 2 3 4 5 6 .....

My issue is listed in the "Checking for broken hyperlinks..." section of the walk status.
Could this issue be caused by spaces in the document name?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Did you do a rewalk in mode "new" or in mode "refresh"? A refresh walk may not have done much if nothing was due. Do a walk in mode "new" to ensure that everything is redone.
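
The difference can be pictured with a toy Python model. This is an assumption-laden sketch, not how the walker is actually implemented: the function, the per-URL "due" times, and the example.com URLs are all hypothetical.

```python
from datetime import datetime, timedelta

def urls_to_fetch(pages, mode, now):
    """pages maps each URL to the time its next refresh is due (toy model)."""
    if mode == "new":
        # Start over: every URL is fetched again under the current settings.
        return sorted(pages)
    # "refresh": revisit only the URLs whose refresh time has arrived.
    return sorted(url for url, due in pages.items() if due <= now)

# Hypothetical walk state, for illustration only.
now = datetime(2007, 2, 8, 12, 0)
pages = {
    "http://www.example.com/a.html": now + timedelta(days=1),
    "http://www.example.com/b.html": now - timedelta(hours=1),
}
print(urls_to_fetch(pages, "refresh", now))
print(urls_to_fetch(pages, "new", now))
```

In this model, a page that is not yet due is skipped by a refresh, so links found on it are not re-evaluated against changed settings until a "new" walk runs.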
cyril.adam
Posts: 8
Joined: Tue Feb 06, 2007 9:58 am

Post by cyril.adam »

I changed the rewalk mode from "refresh" to "new" and it is working well now.
Thank you for your help.

Cyril.