excel files not showing in search results

hqweb
Posts: 22
Joined: Wed Sep 03, 2003 3:59 am

Post by hqweb »

Hi,
We have installed the Webinator update that shows Word and Excel titles in the search results. It works perfectly for Word files but not for Excel files: no Excel files are returned in the search results.
The .xls and .csv extensions are included in the list of extensions to be indexed, and when we enter the full path to an Excel file in "list/edit urls" we can see that the file has been indexed.
Any ideas why the Excel files are not showing up in the search results?
Thanks.
Niamh
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Their text doesn't contain what you're searching for.

In list/edit urls, click on one of the Excel files to see all the info about that file and the content that was extracted from it.
hqweb
Posts: 22
Joined: Wed Sep 03, 2003 3:59 am

Post by hqweb »

OK, I've clicked on the Excel file in list/edit urls, and the content shows the title as "xls Document (71k)", even though when I open the file directly in Excel the title (under file properties) is "OCT-01 COMPENDIUM - PART 1".

So the title is not being indexed for some reason?

Niamh
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Looks like a glitch in the formats.rule file for Excel files. Edit INSTALLDIR/conf/formats.rule. On the lines that contain "vnd.ms-excel", change "%IN%" to "%ANYTOTXFLAGS% %IN%".
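
If you would rather script that edit than do it by hand, here is a minimal sketch in Python. It only performs the textual substitution described above and assumes nothing about the rest of the formats.rule syntax. The function name and the sample lines are made up for illustration and are not taken from a real formats.rule.

```python
def patch_excel_rules(text: str) -> str:
    """Return text with "%IN%" changed to "%ANYTOTXFLAGS% %IN%"
    on every line that mentions vnd.ms-excel.

    Lines that already contain %ANYTOTXFLAGS% are left alone, so
    running the function twice does not duplicate the flag.
    """
    patched = []
    for line in text.splitlines(keepends=True):
        if "vnd.ms-excel" in line and "%ANYTOTXFLAGS%" not in line:
            line = line.replace("%IN%", "%ANYTOTXFLAGS% %IN%")
        patched.append(line)
    return "".join(patched)


if __name__ == "__main__":
    # Placeholder lines only; a real formats.rule will look different.
    sample = (
        "placeholder-rule vnd.ms-excel placeholder-command %IN%\n"
        "placeholder-rule some-other-type placeholder-command %IN%\n"
    )
    print(patch_excel_rules(sample), end="")
```

To apply it, read INSTALLDIR/conf/formats.rule, pass the contents through the function, and write the result back after keeping a copy of the original file.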
hqweb
Posts: 22
Joined: Wed Sep 03, 2003 3:59 am

Post by hqweb »

Hi,
I have modified this file and now I'm getting inconsistencies in the search results. I have two Excel files, both of which have a title defined in the file properties. When I search for these files (using the title or part of it), one file's title shows up in the search results, but the other one just says "xls Document" (both documents show up in list/edit urls).

I noticed that in the mime.type file in the conf directory Excel is defined as
application/vnd.ms-excel xls
Is this correct? Should there be a csv in here also?
Thanks.
Niamh
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

And of course you did a new rewalk of the site, right?
For the purposes of this, ignore the search; it will show whatever was walked. Focus on list/edit urls, which will tell you everything about the file.

Can you provide urls to the 2 documents in question?
You can make the message private if you don't want the general public seeing them.

csv is not a proprietary MS excel format. It's plain text and doesn't need a special mime type. And that particular config file is not relevant to web walking.
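
As a rough illustration of the difference (not anything Webinator itself does, as far as this thread shows): binary .xls files are OLE2 compound documents and start with a fixed 8-byte signature, while a .csv file is just text. A small Python sketch, with a made-up function name, that tells the two apart by the leading bytes:

```python
# Signature at the start of every OLE2 compound file. The binary .xls
# format uses this container, and so do other legacy Office formats
# such as .doc, so a match means "OLE2 container", not "Excel" as such.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_ole2(data: bytes) -> bool:
    """Return True if the data begins with the OLE2 signature."""
    return data.startswith(OLE2_MAGIC)


def file_looks_like_ole2(path: str) -> bool:
    """Read the first 8 bytes of a file and check them."""
    with open(path, "rb") as handle:
        return looks_like_ole2(handle.read(len(OLE2_MAGIC)))
```

A .csv file will fail this check because it begins with ordinary text, which is why it needs no special filter.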

Depending on what the web server returns for the documents, there's a small chance they are being handled slightly differently. In dowalk's doplugin function, find where it says "<case "application/vnd.ms-excel">". Change -fmsw to -fxls and add the following line below that one:
<case "application/vnd.ms-excel"><dofilt opt=$xlsopt dt="MSExcel"><return>
And on the line where it says "<case "xls">", change -fmsw to -fxls.
(Or get the latest dowalk and webinatoradmin from the Webinator examples page.)
hqweb
Posts: 22
Joined: Wed Sep 03, 2003 3:59 am

Post by hqweb »

Hi,
I've made the above changes in dowalk's doplugin function and now it works. Thanks very much.
Niamh
UNHCR