
Query re Duplicate report

Posted: Tue Oct 09, 2007 8:19 pm
by legedza.henry
Hi there,

I just ran a reindex on one of our sites and there were a series of duplicates reported.

I am somewhat confused as to why the following files are reported as duplicates.

The link: http://www.decs.sa.gov.au/accountabilit ... 090505.pdf

Is a duplicate of: http://www.decs.sa.gov.au/accountabilit ... 230807.pdf

Referenced by: http://www.decs.sa.gov.au/accountabilit ... avgrp=2239

There are a number of similar occurrences where the two files have nothing in common other than both being PDF files.


Posted: Tue Oct 09, 2007 10:17 pm
by mark
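Look in all walk settings to see what "Duplicate check fields" is set to. Then look up the URL in the database using "List/Edit Urls" to find what's in those fields. Maybe those PDFs are just scans without any text? If so, those fields could be empty for both files, which would make them look identical to the duplicate check.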
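
To illustrate how empty fields could produce this kind of report, here is a minimal Python sketch of field-based duplicate detection. It is not the search engine's actual implementation, just the general idea: the function names, the `body` field, and the example.com URLs are all made up for illustration.

```python
import hashlib


def duplicate_key(record, check_fields):
    """Build a hash from the chosen fields only (hypothetical helper)."""
    # Missing fields are treated as empty strings, like a scan with no text.
    joined = "\x00".join(record.get(field, "").strip() for field in check_fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def find_duplicates(records, check_fields):
    """Return (url, first_seen_url) pairs whose check fields hash the same."""
    seen = {}
    duplicates = []
    for record in records:
        key = duplicate_key(record, check_fields)
        if key in seen:
            duplicates.append((record["url"], seen[key]))
        else:
            seen[key] = record["url"]
    return duplicates


# Two scanned PDFs with no extractable text, plus one ordinary page.
records = [
    {"url": "http://example.com/scan-a.pdf", "body": ""},
    {"url": "http://example.com/scan-b.pdf", "body": ""},
    {"url": "http://example.com/page.html", "body": "Some real text"},
]

for url, original in find_duplicates(records, ["body"]):
    print(f"{url} is a duplicate of {original}")
```

In this sketch the two scans are flagged as duplicates of each other because the only field being compared is empty in both. Adding a field that differs between the files (for example the URL) to the comparison would stop them from matching.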