walk cgi-bin path

sourceuno
Posts: 225
Joined: Mon Apr 09, 2001 3:58 pm

Post by sourceuno »

I am walking files in the cgi-bin path, but I only want to walk files within that particular path. For example, there may be a link to the home page on that script, but I do not want to walk the home page. Is there any way that I can limit the pages walked to pages within a certain path?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Post by sourceuno »

Is the -j option only good for path names? I'm using two scripts, but I only want the results from one of them to be saved in the database. For example, script2 might contain many links to script1. I get the error "Disallowed path" on the following call:

gw -s -d- -jhttp://www.mysite.com/cgi-bin/script1.cgi "http://www.mysite.com/cgi-bin/script2.cgi"

Would I be able to save the results from script1 without saving the results from script2?
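
For illustration only, here is a rough Python sketch of how a plain prefix restriction would treat the call above. This is not how gw is implemented; the helper name is_allowed and the idea that the restriction is a simple same-host path-prefix match are assumptions. Under that assumption, a start URL of script2.cgi falls outside a prefix of script1.cgi, while a prefix of /cgi-bin/ admits both scripts but not the home page.

```python
from urllib.parse import urlsplit


def is_allowed(url: str, allowed_prefix: str) -> bool:
    """Hypothetical check: same scheme and host, and the path starts with the prefix path."""
    u, p = urlsplit(url), urlsplit(allowed_prefix)
    same_site = (u.scheme, u.netloc) == (p.scheme, p.netloc)
    return same_site and u.path.startswith(p.path)


start = "http://www.mysite.com/cgi-bin/script2.cgi"

# Prefix is the full script1 URL: the start URL itself is outside it.
print(is_allowed(start, "http://www.mysite.com/cgi-bin/script1.cgi"))  # False

# Prefix is the cgi-bin directory: both scripts are inside it.
print(is_allowed(start, "http://www.mysite.com/cgi-bin/"))  # True

# A link back to the home page is outside the cgi-bin prefix.
print(is_allowed("http://www.mysite.com/index.html", "http://www.mysite.com/cgi-bin/"))  # False
```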

Post by mark »

Post by sourceuno »

But I need to walk script2.cgi, since that is where the references to script1.cgi are. It contains the following:

script1.cgi?id=1
script1.cgi?id=2
script1.cgi?id=3

I don't want to walk script1 for any IDs other than 1, 2, and 3, which are the ones referenced in script2. I have no way of knowing these IDs without walking through script2. Is this possible within Webinator?
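
To make the requirement concrete, here is a minimal Python sketch (not Webinator or Vortex code) that pulls only the script1.cgi links out of a listing page like script2.cgi. The class LinkCollector, the function detail_links, and the sample HTML are all made up for illustration; the real pages may be structured differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects absolute URLs from <a href=...> tags (hypothetical helper)."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page they came from.
                self.links.append(urljoin(self.base_url, value))


def detail_links(html: str, base_url: str, script_name: str = "script1.cgi"):
    """Return only the links whose path ends in the given script name."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return [u for u in parser.links if urlsplit(u).path.endswith("/" + script_name)]


# Made-up stand-in for the output of script2.cgi.
sample = """
<a href="script1.cgi?id=1">one</a>
<a href="script1.cgi?id=2">two</a>
<a href="script1.cgi?id=3">three</a>
<a href="/index.html">home</a>
"""

found = detail_links(sample, "http://www.mysite.com/cgi-bin/script2.cgi")
for url in found:
    print(url)
```

Only the three script1.cgi URLs come back; the home page link is dropped, and no other IDs are ever guessed at.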

Post by mark »

That's rather odd. You might be able to contort the scripted walker to do that. Or maybe just delete the script2 pages from the database after walking.

I'm not sure what you're really doing, but you might want to review attachment A of the license agreement at http://www.thunderstone.com/texis/site/ ... llout.html

Post by sourceuno »

I have a database on my site with a table of products. My script2.cgi contains a list of all the products with a link for each product to script1.cgi, which is a detailed page for a selected product. I only want to be able to walk the detailed pages so I can retrieve the description of each product, so that a user might be able to search through all the descriptions. Script2 has links to other pages that I don't need, along with links to Script1.

I'd prefer not to delete the script2 pages from the database after walking, since that might take a long time and it seems wasteful to put them there in the first place. Would modifying the walker to do what I need be complicated?

Post by mark »

Probably something along the lines of just not storing pages with a URL containing "script2". But it sounds like your app would be in violation of at least items 4 and 5 of the previously mentioned attachment A of the license agreement.
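
The "follow it but don't store it" idea can be sketched in a few lines of Python. This is not the Webinator scripted walker; the function walk, the in-memory SITE dictionary, and the page texts are assumptions made up so the example runs without a network.

```python
from collections import deque


def walk(start_url: str, site: dict, skip_substring: str = "script2") -> dict:
    """Breadth-first walk: follow links on every page, but store only
    pages whose URL does not contain skip_substring."""
    stored = {}
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        page = site.get(url)  # stands in for an HTTP fetch
        if page is None:
            continue  # unreachable page: nothing to store or follow
        if skip_substring not in url:
            stored[url] = page["text"]
        for link in page["links"]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return stored


BASE = "http://www.mysite.com/cgi-bin/"

# Made-up site: script2 lists three detail pages served by script1.
SITE = {
    BASE + "script2.cgi": {
        "text": "product list",
        "links": [BASE + "script1.cgi?id=%d" % i for i in (1, 2, 3)],
    },
    BASE + "script1.cgi?id=1": {"text": "description 1", "links": []},
    BASE + "script1.cgi?id=2": {"text": "description 2", "links": []},
    BASE + "script1.cgi?id=3": {"text": "description 3", "links": []},
}

result = walk(BASE + "script2.cgi", SITE)
for url in sorted(result):
    print(url, "->", result[url])
```

The listing page is still fetched and its links are still followed, so the three detail pages are found, but the listing page itself never reaches the stored set.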

Post by sourceuno »

If the data that I'm retrieving resides on another server that I do not own or manage, and I have no control over that data, would I still be in violation of item 4 of attachment A of the license agreement?

Exactly how would I be in violation of item 5? I'm not sure if I fully understand what that item means.

Post by mark »

In the previous message you said it was yours on your site. Anyhow, the gist of item 4 is that you're not allowed to use Webinator as a shoehorn replacement for what should be a full Texis application. If you're indexing pages on your site(s) such that the portion from your database amounts to 30% or less of the overall content indexed by Webinator, you're OK.

Item 5 is basically saying we don't want you using Webinator to provide search services for other database software that has no search abilities, or poor ones, thereby making it look better than it is. Again, if the Webinator database contains largely other site data besides just the database pages, you're OK.
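
As a rough way to check the 30% figure mentioned above, the arithmetic can be sketched in Python. The function database_share and the page counts are made up, and measuring "content" by page count is an assumption; the license agreement itself is the authority on what is actually measured.

```python
def database_share(database_pages: int, total_pages: int) -> float:
    """Fraction of indexed pages that were generated from the database."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return database_pages / total_pages


# Hypothetical walk: 250 database-driven pages out of 1000 indexed pages.
share = database_share(250, 1000)
print(share)          # 0.25
print(share <= 0.30)  # True

# Hypothetical walk that indexes only the database-driven pages.
print(database_share(400, 400) <= 0.30)  # False
```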