getting pages walked per base url

KMandalia
Posts: 301
Joined: Fri Jul 09, 2004 3:50 pm

Post by KMandalia »

I want to generate a report that gives the number of pages walked per base URL.

I am not sure whether Webinator saves the base URLs anywhere. If it doesn't, the only way to get them is some rex filtering, so that I only get back the string up to the first '/'.

Let me know how I can query the html table efficiently to get an accurate count of pages walked per base URL.

Can testdb and the options table help?
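
For illustration only, the "string up to the first '/'" idea from the question can be sketched in Python. This is not Vortex and not something Webinator provides. The helper names and sample URLs are made up for the example, and the base is derived from each walked URL, not read from the configured base URLs.

```python
from collections import Counter
from urllib.parse import urlsplit


def base_url(url: str) -> str:
    """Return scheme://host, i.e. everything before the first path '/'."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def pages_per_base(urls):
    """Count how many URLs fall under each derived base URL."""
    return Counter(base_url(u) for u in urls)


# Hypothetical sample standing in for the Url column of the html table.
walked = [
    "http://www.example.com/",
    "http://www.example.com/docs/a.html",
    "http://other.example.org/index.html",
]

counts = pages_per_base(walked)
for base, n in sorted(counts.items()):
    print(base, n)
```

Grouping this way makes a single pass over the URLs, but it only matches the configured base URLs when each of them is a bare host with no path.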
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

The options table in testdb will give you the list of base URLs. Depending on the URLs you use as base URLs, you may be able to loop over them and use matches, e.g.

select count(*) from html where Url matches 'BASEURL%';

where the % is the wildcard.
John Turnbull
Thunderstone Software
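
As a rough picture of the loop described above, here is a Python sketch (not Vortex) that counts walked URLs by prefix for each base URL. The function name and sample data are assumptions made for the example. `startswith` stands in for the `matches 'BASEURL%'` test.

```python
def count_by_prefix(base_urls, walked_urls):
    """Return {base_url: number of walked URLs that start with it}."""
    return {
        base: sum(1 for url in walked_urls if url.startswith(base))
        for base in base_urls
    }


# Hypothetical base URLs and walked URLs.
bases = ["http://www.example.com/", "http://other.example.org/"]
walked = [
    "http://www.example.com/",
    "http://www.example.com/docs/a.html",
    "http://other.example.org/index.html",
    "http://unrelated.example.net/",
]

report = count_by_prefix(bases, walked)
for base, n in report.items():
    print(f"{base},{n}")
```

If one base URL is itself a prefix of another, a page is counted under both. That is probably why the advice is qualified with "depending on the URLs you use as base urls".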
KMandalia
Posts: 301
Joined: Fri Jul 09, 2004 3:50 pm

Post by KMandalia »

OK. What is the name of the setting I would use to pull the base URLs from the options table in testdb for my profile?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

SSc_url

Hint: on the settings page, look at the URL behind the ? next to a setting. It ends with #h_SETTINGNAME.
KMandalia
Posts: 301
Joined: Fri Jul 09, 2004 3:50 pm

Post by KMandalia »

Got it.

Thanks for the hint!
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Also, the settings in multi-line boxes generally need extra processing. See dowalk's applysettings function for how it processes them. For example, SSc_url needs to be broken up into a list:
<split nonempty "\space+" $SSc_url></split>
<$SSc_url=$ret>
Then you can loop over them with:
<loop $SSc_url>
...
</loop>
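
For readers less familiar with Vortex, the effect of that split (break the multi-line setting on runs of whitespace and drop empty pieces) can be sketched in Python. The function name and the sample setting value are made up for the example.

```python
def split_setting(raw: str):
    """Split a multi-line setting on runs of whitespace, dropping empty items."""
    # str.split() with no arguments treats any run of spaces, tabs, or
    # newlines as one separator and never yields empty strings.
    return raw.split()


# Hypothetical contents of a multi-line base URL box.
raw_setting = "http://www.example.com/\n\n  http://other.example.org/docs/\r\n"

for url in split_setting(raw_setting):
    print(url)
```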
KMandalia
Posts: 301
Joined: Fri Jul 09, 2004 3:50 pm

Post by KMandalia »

I'm trying to get the number of pages walked per base URL, but I must be doing something wrong. Can you point it out for me?

<script language=vortex>
<timeout=-1></timeout>
<a name=main public urllisting mydb>
<SQL "select String from options where Profile='myprofile' and Name='SS_db'"></sql>
<$mydb=$string>
<SQL "select String from options where Profile='myprofile' and Name='SSc_url'"></sql>
<$urllisting=$string>
<split nonempty "\space+" $urllisting></split>
<$urllisting=$ret>
<loop $urllisting>
$urllisting
<sql db=$mydb "select count(*) No from html where Url matches '%$urllisting%'">
$urllisting,$No
</sql>
</loop>
</a>
</script>
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

Inside the quotes, $urllisting loses its meaning as a variable. You probably want:

<strfmt "%s%%" $urllisting><$pattern=$ret>
<sql db=$mydb "select count(*) No from html where Url matches $pattern">
John Turnbull
Thunderstone Software
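
The same principle (build the pattern first, then pass it to the query as a value, not as text inside a quoted literal) applies in other SQL environments. The sketch below uses Python's sqlite3 module purely as an analogy. SQLite's `LIKE` is swapped in for Texis `matches`, and the table contents are invented.

```python
import sqlite3

# In-memory stand-in for the html table; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE html (Url TEXT)")
conn.executemany(
    "INSERT INTO html (Url) VALUES (?)",
    [
        ("http://www.example.com/",),
        ("http://www.example.com/docs/a.html",),
        ("http://other.example.org/index.html",),
    ],
)


def count_pages(conn, base):
    """Count rows whose Url starts with base, binding the pattern as a parameter."""
    pattern = base + "%"  # same idea as <strfmt "%s%%" ...>
    row = conn.execute(
        "SELECT count(*) FROM html WHERE Url LIKE ?", (pattern,)
    ).fetchone()
    return row[0]


print(count_pages(conn, "http://www.example.com/"))
```

In SQLite, `_` and `%` inside the base URL would also act as wildcards, and `LIKE` ignores ASCII case by default, so this is only a loose analogy to the Texis query.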
KMandalia
Posts: 301
Joined: Fri Jul 09, 2004 3:50 pm

Post by KMandalia »

Thanks, John.

It did the job. Very cool.
KMandalia
Posts: 301
Joined: Fri Jul 09, 2004 3:50 pm

Post by KMandalia »

If I split the base URLs as explained, strip the http://www. part from each one, and then do the following:

<sort $urllisting>
<$urllisting=$ret>

it doesn't sort. What am I doing wrong?
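
One thing worth checking, offered as an assumption and not a confirmed diagnosis: if `<sort>` reorders its argument in place and does not set `$ret`, then the assignment on the following line would overwrite the sorted list with whatever `$ret` held before. For comparison, the intended result (strip the scheme and the www. prefix, then sort) can be sketched in Python. The helper names and sample URLs are made up for the example.

```python
import re


def strip_scheme_and_www(url: str) -> str:
    """Remove a leading http:// or https:// and an optional www. prefix."""
    return re.sub(r"^https?://(www\.)?", "", url)


def sorted_hosts(urls):
    """Strip the prefixes, then return the results in ascending order."""
    return sorted(strip_scheme_and_www(u) for u in urls)


# Hypothetical base URLs.
bases = [
    "http://www.zeta.example/",
    "http://www.alpha.example/",
    "https://mid.example/",
]

for host in sorted_hosts(bases):
    print(host)
```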