How is the $indexcount variable being filled? It seems to be pretty accurate for small counts. At what point does it start estimating the record count?
It starts estimating whenever a single index cannot completely resolve the query, so that post-processing or a linear search is needed to produce the final results from the index results. This can happen because of unindexable terms in the Metamorph query (such as special pattern matchers, punctuation, or the within operator), or because other clauses ANDed or ORed in the WHERE clause cannot be resolved by the same index.
The variables $rows.min and $rows.max are generally set as a lower/upper bound on the result set count; when/if they equal each other, $indexcount is known to be valid. (See the Vortex manual.)
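As a rough illustration of that rule, here is a minimal sketch in Python rather than Vortex. It assumes the values of $rows.min, $rows.max and $indexcount have already been read into plain non-negative integers; the function name and the wording of the output are hypothetical, not part of Vortex.

```python
def describe_count(rows_min: int, rows_max: int, index_count: int) -> str:
    """Build a display string for a hit count.

    rows_min / rows_max stand in for Vortex's $rows.min / $rows.max and
    index_count for $indexcount. All three are hypothetical plain-int
    inputs, not values read from a real Vortex script.
    """
    if rows_min == rows_max:
        # The bounds agree, so the count is exact.
        return f"{rows_min} matches"
    # The bounds differ: the index alone could not resolve the query,
    # so the index count is only an estimate.
    return f"about {index_count} matches (between {rows_min} and {rows_max})"


if __name__ == "__main__":
    print(describe_count(7, 7, 7))
    print(describe_count(0, 100, 100))
```

A search page could use a check like this to decide whether to show the count as exact or prefix it with "about".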
If I query for a phrase like "about yahoo" in my index, $indexcount is equal to the threshold that I set using likepindexthresh. The actual count in my table for this phrase is 7. I know "about" is a noise word in my index, so this is probably a case where the term cannot be resolved by the index. Can you suggest any other way to get an accurate count, or to bring $indexcount closer to the accurate count, without performing a lengthy query to get the actual count?
For "about yahoo" you could keep noise words in the index: drop the index and re-create it with "set keepnoise=1; create metamorph .....". You will also need to set <apicp keepnoise on> in your search script. The index can then resolve noise words, and $indexcount will be more accurate for this query. (You don't need to set keepnoise when updating the index, only when creating it and when searching.)
Yes on both, but if there is significant RAM compared to the size of the index (especially if most/all of the .dat file could fit in RAM), there shouldn't be much if any speed degradation.