Hey,
I ran into a pretty tricky problem whilst setting up
a metacrawler script. Whenever I search
for a term using the metacrawler, e.g. beer, my results look something like:
Lycos:
www.beer.com
www.beer.de
www.beer.com/brew/index.htm
www.beer.org
Yahoo:
www.beer.com
www.beer.net
www.beer.org
and so on, so I get loads of duplicate search results.
I want, nice and neat, ONE result per domain, e.g. I want beer.com once, as
opposed to beer.com from Lycos, Yahoo, and all the others again.
Is there an easy way to cross-check the domains of the results, so
that a result only gets displayed if its domain has not
yet appeared in an earlier result?
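To make the behaviour concrete, here is a rough Python sketch of what I am after. It is only an illustration: the names domain_of and dedupe_by_domain are made up, the results are assumed to arrive as a mapping of engine name to a list of URLs, and "domain" is assumed to mean the host name with a leading "www." removed (so subdomains would still count as separate domains).

```python
from urllib.parse import urlsplit


def domain_of(url):
    """Return the lowercased host of a result URL, without a leading 'www.'."""
    # Results may come without a scheme (e.g. "www.beer.com/brew/index.htm"),
    # so prepend "//" to make urlsplit treat the first part as the host.
    if "//" not in url:
        url = "//" + url
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def dedupe_by_domain(results_by_engine):
    """Keep only the first result seen for each domain, in engine order."""
    seen = set()      # domains already shown
    unique = []       # (engine, url) pairs that survive
    for engine, urls in results_by_engine.items():
        for url in urls:
            domain = domain_of(url)
            if domain and domain not in seen:
                seen.add(domain)
                unique.append((engine, url))
    return unique


if __name__ == "__main__":
    results = {
        "Lycos": ["www.beer.com", "www.beer.de",
                  "www.beer.com/brew/index.htm", "www.beer.org"],
        "Yahoo": ["www.beer.com", "www.beer.net", "www.beer.org"],
    }
    for engine, url in dedupe_by_domain(results):
        print(engine, url)
```

With the beer example above, this would keep www.beer.com, www.beer.de and www.beer.org from Lycos and only www.beer.net from Yahoo. The set lookup keeps the check cheap however many engines are queried.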
Appreciate your help,
Robert Zrim