Hello:
I am trying to index a website from Taiwan that has good content about
international trade. When I visited the site (http://tptaiwan.org.tw/),
I found that its first page contains two clickable images:
http://203.66.210.8/indexsetc.htm ---> links to the second page
written in Chinese
http://203.66.210.8/indexsete.htm ---> links to the second page
written in English
There is no robots.txt under http://tptaiwan.org.tw/, so I ran
gw http://tptaiwan.org.tw/. After the run, the robot had indexed only
one page, the very first page, and nothing from either of the second
pages (i.e., gw -st "select * from html" returned only the text of the
first page, and gw -st "select * from refs" listed both second pages
as references).
I then assumed that http://tptaiwan.org.tw/ keeps its content on
http://203.66.210.8/, so I ran gw -o http://tptaiwan.org.tw/, but I
got the same results.
Both http://203.66.210.8/indexsetc.htm and
http://203.66.210.8/indexsete.htm also contain many clickable/linkable
images. When I ran gw directly on either of these two URLs, the robot
was able to follow the links. So why didn't it do so for the very
first page, http://tptaiwan.org.tw/?
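
To show what I suspect is happening, here is a minimal Python sketch.
It is not Webinator code, and I do not know how gw decides this
internally; the function name split_links_by_host is my own invention.
It only illustrates that both second pages sit on a different host
than the start URL, which may be why they end up as references rather
than indexed pages.

```python
from urllib.parse import urljoin, urlparse


def split_links_by_host(start_url, hrefs):
    """Separate links on the start URL's host from links on other hosts.

    Hypothetical helper for illustration only; this is an assumption
    about crawler behaviour, not Webinator's actual logic.
    """
    start_host = urlparse(start_url).hostname
    same_host, other_host = [], []
    for href in hrefs:
        # Resolve relative links against the start page first.
        absolute = urljoin(start_url, href)
        if urlparse(absolute).hostname == start_host:
            same_host.append(absolute)
        else:
            other_host.append(absolute)
    return same_host, other_host


start = "http://tptaiwan.org.tw/"
links = [
    "http://203.66.210.8/indexsetc.htm",
    "http://203.66.210.8/indexsete.htm",
]
same, other = split_links_by_host(start, links)
print("same host: ", same)
print("other host:", other)
```

With the two image links from the first page, both URLs land in the
"other host" list, because the string 203.66.210.8 does not match the
hostname tptaiwan.org.tw even if both point to the same machine.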
How can Webinator fetch and index a website whose links are clickable
images? And how can I configure Webinator to AUTOMATICALLY follow the
links to both second pages of http://tptaiwan.org.tw/, assuming I did
not know in advance that the site uses image links and a different
hostname for its subsequent pages?
Thank you in advance.
David Chan