index problem

Post by **rkruger** » Fri Jun 08, 2001 4:47 am

Indexing a site with "gw http://www.searchbroker.de" works.
But sites like "www.schnigge.de" or "www.javamagazin.de" don't work: gw finds no pages to index. I've tried options like "-j" and "-y".

Any ideas?

Thanks!

Post by **bart** » Fri Jun 08, 2001 10:01 am

The problems are different at each site. www.schnigge.de has a large number of pages that are not URL driven. Many of its links are generated by client-side JavaScript. The crawler will not follow links that require it to execute client-side JavaScript.
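
To illustrate the JavaScript point, here is a minimal Python sketch of a link extractor that reads only the static markup, as a crawler that does not execute scripts would. This is not Webinator's code; the `StaticLinkParser` class, the `static_links` helper, and the sample page are all made up for illustration.

```python
from html.parser import HTMLParser


class StaticLinkParser(HTMLParser):
    """Collects href values from <a> tags present in the raw HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def static_links(html):
    """Return the links visible without running any script."""
    parser = StaticLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page: one plain link, one link written out by a script.
SAMPLE_PAGE = """
<html><body>
<a href="/about.html">About</a>
<script type="text/javascript">
document.write('<a href="/quotes.html">Quotes</a>');
</script>
</body></html>
"""

print(static_links(SAMPLE_PAGE))
```

Only the plain link is reported. The link inside the script exists only after a browser runs `document.write`, so a walk that never executes the script has nothing to follow there.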

www.javamagazin.de does not have any content of its own. All of its content really resides at http://entwickler.com/. There is only one page actually located at www.javamagazin.de.

Post by **Kai** » Fri Jun 08, 2001 10:58 am

Also, www.schnigge.de has an off-site <IFRAME>; since it is the only <IFRAME>, that page fails. You could fetch just that page with -o and -g, then continue the walk without -o and -g to get the rest of the site.
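
If it helps to check which pages have this problem before walking them, the rough Python sketch below lists the <IFRAME> sources on a page that point to a different host. It is independent of gw; the `offsite_iframes` helper and the example.com / example.net URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class IframeParser(HTMLParser):
    """Collects the src attribute of every <iframe> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <IFRAME> matches too.
        if tag == "iframe":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def offsite_iframes(html, page_url):
    """Return absolute iframe URLs whose host differs from page_url's host."""
    parser = IframeParser()
    parser.feed(html)
    parser.close()
    page_host = urlparse(page_url).hostname
    result = []
    for src in parser.sources:
        absolute = urljoin(page_url, src)  # resolve relative src values
        if urlparse(absolute).hostname != page_host:
            result.append(absolute)
    return result


# Hypothetical page with one off-site frame and one local frame.
SAMPLE = """
<IFRAME SRC="http://other.example.net/frame.html"></IFRAME>
<iframe src="/local.html"></iframe>
"""

print(offsite_iframes(SAMPLE, "http://www.example.com/index.html"))
```

Pages that show up here are the ones to fetch separately as described above.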