some files with non-alphanumeric filenames not indexed

bogo
Posts: 2
Joined: Tue Dec 05, 2000 6:58 pm

Post by bogo »

I have a bunch of files named
ACL2-PC::TH.html
ACL2-PC::TYPE-ALIST.html
etc.

and none of them get indexed by gw. Is this a bug in gw? Is there a known workaround?

Other filenames that appear not to work:
files with "<" or ">" sign in them,
e.g. STRING>=.html

-bogo
Kai
Site Admin
Posts: 1272
Joined: Tue Apr 25, 2000 1:27 pm

Post by Kai »

Are these files in URLs that are given to gw or linked from pages it is indexing? If not, gw cannot know where they are to index them. Like a browser, gw can only see URLs it is given directly, or URLs linked from those pages, etc.

Are the links properly HTML formatted (eg. double-quoted and HTML-escaped if needed)? Punctuation is acceptable in links, as long as the links are properly formatted.

Can you give us a URL to a page where these links occur?
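To illustrate the point about link discovery: a crawler only learns about URLs that appear in the href attributes of pages it fetches. Here is a minimal sketch using Python's standard html.parser. This is not gw's actual code, and the class name LinkCollector is made up for the example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, roughly as a crawler might."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Attribute values arrive already HTML-unescaped (&gt; becomes >).
            if name == "href" and value:
                self.links.append(value)


collector = LinkCollector()
collector.feed('<a href="STRING&gt;=.html">STRING&gt;=</a>')
collector.close()
print(collector.links)
```

A file that no fetched page links to never shows up in collector.links, which is why it would not be indexed.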
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

They are invalid for various reasons.

href="ACL2-PC::TH.html" : anything up to the : is expected to be a protocol (eg http:). Coded with a protocol they will work: href="http:ACL2-PC::TH.html"

href="ACL2-PC>TH.html" : > is not properly HTML-escaped. Escaped, they will work: href="ACL2-PC&gt;TH.html"

Files with < do work, but the < should also be escaped (as &lt;).
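
If the pages are generated by a script, both problems can be handled when the links are written out. The sketch below is only an illustration: the function name relative_href is hypothetical, and it uses a "./" prefix (the approach RFC 3986 describes for relative paths whose first segment contains a colon) instead of the http: prefix mentioned above. Whether gw resolves such links the same way is an assumption that should be checked against a real crawl.

```python
from html import escape
from urllib.parse import urlsplit


def relative_href(filename):
    """Build an href value for a file that sits next to the linking page."""
    href = filename
    # If a parser would read the text before the first ':' as a scheme,
    # prefix "./" so the whole name is treated as a relative path.
    if urlsplit(href).scheme:
        href = "./" + href
    # Escape &, <, > and quotes for use inside a double-quoted attribute.
    return escape(href, quote=True)


for name in ["ACL2-PC::TH.html", "STRING>=.html"]:
    print('<a href="%s">%s</a>' % (relative_href(name), escape(name)))
```

The link text is escaped as well, since a bare < or > there is just as invalid as in the attribute.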