Yes, you are correct: the database entries for a few pages display no body content.
I don't know what the log entries were for those pages. I do see truncated pages and log entries all the time, and for those entries the body content is the first 5000 characters, just like it's supposed to be.
Here are three of the URLs in question:
http://66.92.69.146/tomresume.htm
http://www.catadjuster.org/adjusters/joyce.html
http://www.hendrik-weber.de/vitae-e7_Delaware2.htm
When I reviewed them with View Source, I found that they all have XML tags at the top, even though the pages have an .htm or .html suffix. The pages appear to have been created by Microsoft Word 9.
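In case it helps to check other pages for the same thing, here is a rough Python sketch that looks for Word-style XML markup near the top of a page. The marker patterns and the function name word_markup_hits are my own assumptions, based on what HTML saved from Word 9 usually looks like. They do not come from gw. The 5000-character window is simply borrowed from the truncation length mentioned above.

```python
import re

# Markers that typically appear near the top of HTML saved from Word 9.
# This list is an assumption for illustration, not an exhaustive test.
WORD_MARKERS = (
    re.compile(r'<\?xml\b', re.I),
    re.compile(r'xmlns:\w+\s*=\s*"urn:schemas-microsoft-com:office', re.I),
    re.compile(r'<meta[^>]+name\s*=\s*"?Generator"?[^>]+Microsoft Word', re.I),
)


def word_markup_hits(html, head_chars=5000):
    """Return the marker patterns found in the first head_chars characters."""
    head = html[:head_chars]
    return [p.pattern for p in WORD_MARKERS if p.search(head)]


# Hypothetical sample resembling the top of a Word-generated page.
sample = (
    '<html xmlns:o="urn:schemas-microsoft-com:office:office"\n'
    'xmlns:w="urn:schemas-microsoft-com:office:word">\n'
    '<head><meta name=Generator content="Microsoft Word 9"></head>\n'
    '<body><p>Resume text</p></body></html>'
)

for pattern in word_markup_hits(sample):
    print("found marker:", pattern)
```

Running it against the saved source of each affected page should show whether every one of them carries this markup. That would tell us whether the empty bodies line up with the Word-generated pages.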
Here is the gw command string being used to populate the database:
gw -d/db -noindex -a -r -O -fshtml -fasp -fcfm -t7 -z5000