getting top sites

KMandalia
Posts: 301
Joined: Fri Jul 09, 2004 3:50 pm

Post by KMandalia »

We are crawling some 50-60 websites. I am generating a report from the querylog table that gives me the top search terms, the top search hits (web pages), etc.

What I want to see is the top 5 or 10 websites whose pages users clicked.

I am not sure that the querylog or html tables can help with this. How do I implement this feature?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

I think you're looking for something like this:
"select count(*) Count,sandr('http://=[^/]+.*','\1\2',Query) Site from querylog where Info matches 'what=u%' group by sandr('http://=[^/]+.*','\1\2',Query) order by 1 desc"
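For readers unfamiliar with sandr, the query appears to reduce each clicked URL in the Query column to its http:// prefix plus host, count the rows per site, and sort by that count in descending order. Below is a rough Python analogue of that logic, not Texis code. The sample URLs and the names `site_of` and `top_sites` are made up for illustration, and Python's `re` syntax differs from REX.

```python
import re
from collections import Counter

# Hypothetical sample of clicked URLs; in the thread these would come
# from the Query column of querylog rows where Info matches 'what=u%'.
clicked_urls = [
    "http://www.somesite.com/a.html",
    "http://www.somesite.com/b.html",
    "http://other.example.org/index.html",
]

# Group 1 is the scheme, group 2 is the host; the rest of the URL is discarded.
SITE_RE = re.compile(r"^(http://)([^/]+).*$")


def site_of(url):
    """Return scheme plus host, e.g. 'http://www.somesite.com'."""
    m = SITE_RE.match(url)
    # Leave URLs that do not match unchanged.
    return m.group(1) + m.group(2) if m else url


def top_sites(urls, limit=10):
    """Count clicks per site and return the most-clicked sites first."""
    return Counter(site_of(u) for u in urls).most_common(limit)


if __name__ == "__main__":
    for site, count in top_sites(clicked_urls, limit=5):
        print(count, site)
```

`Counter.most_common` plays the role of `group by ... order by 1 desc` here, and `limit` gives the top 5 or 10 cutoff asked about in the first post.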

Post by KMandalia »

It did the job, but I don't understand how you formed the expression.

Since we are walking domains, I want to drop the www and match only the somesite.com part of http://www.somesite.com. How do I do that?

Post by mark »

See the docs for sandr and rex. Let us know if you have specific syntax questions.

Change both sandr calls to:

sandr('http://=www\.?[^/]+.*','\3',Query)
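As a rough illustration of what this change is meant to achieve, here is a small Python sketch that drops the http:// prefix and an optional leading www. and keeps only the host. It is an analogue written with Python's `re` module, not REX, and the name `bare_site_of` is invented for the example.

```python
import re

# Skip the scheme and an optional "www." prefix; group 1 is the remaining host.
BARE_SITE_RE = re.compile(r"^http://(?:www\.)?([^/]+).*$")


def bare_site_of(url):
    """Return the host without a leading 'www.', e.g. 'somesite.com'."""
    m = BARE_SITE_RE.match(url)
    # Leave URLs that do not match unchanged.
    return m.group(1) if m else url


if __name__ == "__main__":
    print(bare_site_of("http://www.somesite.com/page.html"))
    print(bare_site_of("http://somesite.com/"))
```

With this normalization, pages under http://www.somesite.com and http://somesite.com fall into the same group when counted.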