I had an issue this morning where the disks on the Thunderstone appliance filled to 100%, causing subsequent crawls to fail. Oddly, I only have approximately 5,000 pages being indexed, so I am nowhere near the 250,000 page limit.
After I contacted Thunderstone, they had me run each crawl manually with the rewalk type set to "new." Unfortunately, to do this I had to delete a couple of profiles to free up enough disk space to run a crawl of an existing profile. Thankfully, I had some test profiles that no one would miss if I removed them completely.
After rerunning each profile manually, I reclaimed 67GB of space; I'm now using only 643MB.
It seems that this same issue will keep coming up unless I manually run each profile to flush out the old information. Unfortunately, setting the rewalk type to "new" takes effect only for a manual crawl, not for a scheduled crawl. I have asked Thunderstone support to address this issue and am still waiting to hear back.
Does anyone know of a way that I can monitor the disk space on the Thunderstone appliance automatically? This would be a nice feature to have, especially since this will definitely occur again unless it is patched.
I would also like to be able to view how much total disk space each profile is taking up. What is displayed on the profiles page appears to be only the disk space used by the last collection that ran, not the total amount of disk space used. A per-profile total would let me easily see whether a single profile is causing the disk space issue and whether a setting within that profile can be changed to solve it.
Another way to keep the disk from filling up would be to have the crawls roll automatically. If one or more profiles are holding old data (67GB in my case), that old data would be purged when a new crawl requests the disk space. Is there an existing option to do this that I am missing?
Thanks,
-Tom