keep/ignore tags

Posted: Mon Oct 16, 2006 1:54 pm
by mark
Perhaps you can use "exclude by field" to recognize those index pages and exclude Pages only but not Links.

Posted: Mon Oct 16, 2006 2:15 pm
by josh104
I hadn't realized that exclude by field allowed links to be followed - thanks for bringing that to my attention.

The only remaining problem I can see with using exclude by field is that the risk of false positives/negatives grows as the site being indexed scales.

But still - it's a big improvement over the situation I thought I was facing. Thanks!!

Posted: Mon Mar 26, 2007 11:48 am
by KMandalia
I think I have asked this before and it was working for me before as well.

I have multiple keep tags on some web pages. I was told that Webinator will keep the content between each pair of keep tags. I don't have any ignore tags on those pages.

It turns out that Webinator is only keeping the content from the last keep tag pair instead of all of them. Has something changed in recent dowalk scripts?

Posted: Mon Mar 26, 2007 12:17 pm
by mark
No. That code hasn't been touched in quite a while. Having multiple keep pairs works for me. With content like

abc
<!-- begin index -->
def
<!-- end index -->
ghi
<!-- begin index2 -->
jkl
<!-- end index2 -->
mno

and keep tags of

<!-- begin index --> <!-- end index -->
<!-- begin index2 --> <!-- end index2 -->

The resulting text is "def jkl"
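
To make the expected behavior concrete, here is a rough Python sketch of what "keep the text between each pair" means. This is not Webinator's code; the function name keep_between and the regex approach are assumptions made purely for illustration.

```python
import re

def keep_between(text, pairs):
    """Return the text between each (begin, end) tag pair, in document order.

    Hypothetical helper for illustration only; not how Webinator is implemented.
    """
    found = []
    for begin, end in pairs:
        # Non-greedy match so each begin tag pairs with the nearest end tag.
        pattern = re.escape(begin) + r"(.*?)" + re.escape(end)
        for m in re.finditer(pattern, text, flags=re.DOTALL):
            found.append((m.start(), m.group(1).strip()))
    found.sort()  # order the kept chunks by where they appear in the page
    return " ".join(chunk for _, chunk in found)

page = """abc
<!-- begin index -->
def
<!-- end index -->
ghi
<!-- begin index2 -->
jkl
<!-- end index2 -->
mno"""

pairs = [
    ("<!-- begin index -->", "<!-- end index -->"),
    ("<!-- begin index2 -->", "<!-- end index2 -->"),
]

result = keep_between(page, pairs)
print(result)  # def jkl
```

Running a page's source through something like this is a quick way to check whether the tags in the page really match the configured keep tags character for character.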

Posted: Mon Mar 26, 2007 12:34 pm
by KMandalia
There goes my issue.

Instead of index and index2, I use beginkeep and endkeep multiple times in a page, thinking that every time a keep tag is encountered, the content is scraped and stored in memory, appended to the previous content. I was wrong then.

Posted: Mon Mar 26, 2007 1:37 pm
by mark
No. Using the same tag multiple times also works.

abc
<!-- begin index -->
def
<!-- end index -->
ghi
<!-- begin index -->
jkl
<!-- end index -->
mno

and keep tags of

<!-- begin index --> <!-- end index -->
<!-- begin index2 --> <!-- end index2 -->

The resulting text is "def jkl"
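
The repeated-tag case can be sketched the same way. Again, this is only an illustration in Python, not Webinator's implementation; keep_repeated is a made-up name, and the choice to stop at an unmatched begin tag is an assumption of the sketch.

```python
def keep_repeated(text, begin, end):
    """Collect the text between every occurrence of one begin/end tag pair.

    Hypothetical helper for illustration only; not how Webinator is implemented.
    """
    chunks = []
    pos = 0
    while True:
        start = text.find(begin, pos)
        if start == -1:
            break  # no more begin tags
        start += len(begin)
        stop = text.find(end, start)
        if stop == -1:
            break  # unmatched begin tag: this sketch simply stops here
        chunks.append(text[start:stop].strip())
        pos = stop + len(end)  # resume scanning after the end tag
    return " ".join(chunks)

page = """abc
<!-- begin index -->
def
<!-- end index -->
ghi
<!-- begin index -->
jkl
<!-- end index -->
mno"""

result = keep_repeated(page, "<!-- begin index -->", "<!-- end index -->")
print(result)  # def jkl
```

Each begin tag is paired with the next end tag after it, so every kept section is appended to the ones before it.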