Hi, I am having trouble restricting Webinator so that it keeps only the URLs that begin with a certain string.
OK, the manual clearly says that -j is the answer. However, it does not work for me, and I am not sure where the problem lies. One thing I should mention: when a user visits our website, they are allocated an anonymous session ID that gets inserted into the URL just after the domain name. I don't think this is the problem, but I am telling you just in case. So, an example:
I want to only be concerned with URLs that begin with:
http://myReallyGreatWebSite.com/it/
When I run Webinator from the command line, I pass it a small file full of options, one of which is:
jhttp://myReallyGreatWebSite.com/it/
(i.e. without the dash for the j option)
I have also explicitly allowed the .htm extension with fhtm (though I don't think that is required).
There are no other options in the config file that restrict or enable anything, just settings for verbosity, breadth-first and so on.
So when I run it:
gw -dmydatabase -mmyConfigfile http://myReallyGreatWebSite.com/it/index.htm
I get the following output:
Adding todo: http://myReallyGreatWebSite.com/it/index.htm
Saving options and URLs to lastrun
http://myReallyGreatWebSite.com/it/index.htm
0: TotLinks: 0, Links: 0/ 0, Good: 0, New: 0 Retrieving
0: TotLinks: 0, Links: 0/ 0, Good: 0, New: 0 Disallowed path(i)
0: TotLinks: 0, Links: 0/ 0, Good: 0, New: 0 Disallowed MIME type
0: TotLinks: 0, Links: 0/ 0, Good: 0, New: 0
So I get a disallowed path as well as a disallowed MIME type, even though the HTTP header says that the Content-Type is text/html (yeah, sure, it isn't quite the same as MIME type).
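To show why I mentioned the session ID at all, here is a small Python sketch of plain "begins with" matching. It is only my guess at the kind of check involved, not Webinator's actual logic, and the session ID value and the two helper functions are made up for illustration. If the server rewrote the URL to put a session ID right after the domain name, a simple prefix test would no longer match:

```python
from urllib.parse import urlsplit, urlunsplit

# The prefix I am trying to restrict the walk to (taken from my options file).
REQUIRED_PREFIX = "http://myReallyGreatWebSite.com/it/"


def is_allowed(url, prefix=REQUIRED_PREFIX):
    """Plain 'begins with' test. Hypothetical; not Webinator's real code."""
    return url.startswith(prefix)


def insert_session_id(url, session_id):
    """Mimic a site that puts a session ID right after the domain name.

    The exact URL layout here is an assumption made for illustration.
    """
    parts = urlsplit(url)
    new_path = "/" + session_id + parts.path
    return urlunsplit(
        (parts.scheme, parts.netloc, new_path, parts.query, parts.fragment)
    )


original = "http://myReallyGreatWebSite.com/it/index.htm"
rewritten = insert_session_id(original, "HYPOTHETICAL_SESSION_ID")

print(original, is_allowed(original))
print(rewritten, is_allowed(rewritten))
```

If something like this is happening on a redirect, it might explain the disallowed path, but I have not confirmed that the URL is actually being rewritten during the walk.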
Do you have ANY idea what I, the program, or the system is doing wrong?
I am running GW Version 2.56 (Commercial) on Sun Solaris 8.
Neil.