Excel files with dowalk script

mark
Site Admin
Posts: 5515
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

You must be typing the URL incorrectly. Run a search for a file that does appear and look at the URL shown for it. Use that (minus the leading http://) to find the .doc file. Then use a similar URL to find the .xls file.
jeuteneier
Posts: 32
Joined: Wed May 16, 2001 2:54 pm

Post by jeuteneier »

I got the Excel document to be recognized by using -fother in the dowalk script (thanx John). One more problem though: when the results come up, the title isn't displayed. It just says, "MSExcel Document (309KB)".

How do I get the title recognized?

Thanx,
Justin
mark
Site Admin
Posts: 5515
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

The plugin doesn't currently know how to extract titles from Excel files.
jeuteneier
Posts: 32
Joined: Wed May 16, 2001 2:54 pm

Post by jeuteneier »

Will that be an update in the new version? Do you have any idea of when that will be available?

Justin
Faiz
Posts: 109
Joined: Wed Jan 10, 2001 1:29 pm

Post by Faiz »

Hi,
I could extract the content of an Excel file using the option -fother but not using -fmsw (it returns nothing with that option). -fother works fine, but it also picks up some junk characters like "B a x X A r i a l A r i a l A r i a l A r i a l A r i a l A r i a l Red Red x Sheet1 a Sheet2 h Sheet3 i C". Is it possible to take these characters out of the return value? Are there any workarounds?
mark
Site Admin
Posts: 5515
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

You could remove some things with <sandr> but there's not much point and it may cause good hits to be missed. The extracted text is primarily used for searching. You generally only see the part of the text that contained the best match to the query (unless you click on match info).
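To illustrate the trade-off described above, here is a minimal sketch in Python (not Vortex, and not how <sandr> itself is invoked). It assumes the junk shows up as runs of three or more single characters separated by single spaces, as in the "A r i a l" sample; the pattern and the function name strip_spaced_runs are hypothetical, chosen only for this example.

```python
import re

# Assumption: junk appears as 3+ single characters separated by single spaces,
# e.g. "A r i a l". This pattern is illustrative, not taken from the plugin.
SPACED_RUN = re.compile(r"(?<!\S)\S(?: \S){2,}(?!\S)")


def strip_spaced_runs(text: str) -> str:
    """Remove runs of space-separated single characters, then tidy whitespace."""
    cleaned = SPACED_RUN.sub(" ", text)
    return " ".join(cleaned.split())


if __name__ == "__main__":
    # Junk similar to the sample in the thread is removed.
    print(strip_spaced_runs("B a x X A r i a l A r i a l Red Red x Sheet1"))
    # Legitimate content that happens to look the same is removed too,
    # so a search for "A B C" would no longer hit this text.
    print(strip_spaced_runs("Plan A B C approved"))
```

The second call shows why this kind of cleanup can cost good hits: any real cell content that looks like the junk pattern disappears from the searchable text as well.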
Faiz
Posts: 109
Joined: Wed Jan 10, 2001 1:29 pm

Post by Faiz »

The plugin used in the dowalk script extracts the contents of most Excel files, but for some it fails. What could be the reason? Does it have something to do with text formatting or file corruption?
John
Site Admin
Posts: 2597
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

Most likely it has to do with the way the text is stored in the file. Is it maybe different versions of Excel?
John Turnbull
Thunderstone Software
mark
Site Admin
Posts: 5515
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Some reasons for failure could be file truncation during download because of too small a -z setting, encryption, graphical text, or multi-byte character sets.
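As a rough way to check the first of those causes outside the walker, the sketch below (Python, not part of dowalk) flags a fetched file whose size reached the download cap and checks for the OLE2 header that Excel 97-2003 .xls files start with. The names diagnose and limit_bytes are hypothetical, and the cap is passed in as a plain byte count; how -z expresses its value is not covered in this thread.

```python
from typing import List

# First 8 bytes of an OLE2 compound file, the container used by
# Excel 97-2003 .xls files.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def diagnose(data: bytes, limit_bytes: int) -> List[str]:
    """Return a list of likely reasons text extraction could fail.

    limit_bytes is a hypothetical stand-in for the walker's download cap.
    """
    problems = []
    if len(data) >= limit_bytes:
        # A file exactly as large as the cap was probably cut off there.
        problems.append("size reached the fetch limit; file may be truncated")
    if not data.startswith(OLE2_MAGIC):
        problems.append("no OLE2 header; not a 97-2003 .xls, or corrupted")
    return problems


if __name__ == "__main__":
    sample = OLE2_MAGIC + b"\x00" * 100
    print(diagnose(sample, limit_bytes=1000))  # small, well-formed header
    print(diagnose(sample, limit_bytes=108))   # size equals the cap
```

This does not detect encryption, graphical text, or multi-byte character sets; it only narrows down the truncation and corruption cases.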
Faiz
Posts: 109
Joined: Wed Jan 10, 2001 1:29 pm

Post by Faiz »

Thanx. What is "too small a -z setting"? Is it the font size?