So instead of the above <pre>...</pre> around <manglepage> what would it be? The existing snippet messes up the GUI.
In any case, looking at the log file, which is readable (thanks for the filename tip), I see that the sandr doesn't appear to be doing any good: my remove word is still there.
<pre>s_removewords[0]="[^\alnum]\P=>>MYWORDHERE=\F[^\alnum]"
s_removewords[1]=">>= "
s_removewords[2]=" =>>="
html before=...
etc.</pre>
<local hb he>
<$needmangle=N><!-- do we need to process HTML before extracting
was AFTER my code, which was resetting needmangle back to N, and thus the code wasn't getting called in <manglepage>. I moved my (your!) code to after that section ending in </local>, and it appears to do the trick!
Ok, one related question... I tried using the 'ignore tags' feature for this but it didn't quite work as expected, and ended up wiping huge chunks of my resulting HTML.
If I want to remove all data between this:
Start:
<b>This is my start: ...
End:
... my ending</a>
What would be the easiest way of doing that? I want to delete the smallest possible match wherever it occurs on the page, so it doesn't get too greedy by accident.
Ignore tags will delete everything between and including the strings you specify; they need not be HTML tags.
The removal is not greedy in the sense of taking the largest match; it takes the first match. It is greedy, however, in the sense that if the end tag doesn't exist it will remove everything to the end of the file. You can change that by changing the line
<strfmt ">>%s=!%s*%s?" $hb $he $he>
to
<strfmt ">>%s=!%s*%s" $hb $he $he>
Then it will remove only if both the begin and end tags exist.
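For anyone who wants to try the two behaviours outside of Vortex, here is a rough sketch in Python. It uses Python's `re` module rather than REX, so the pattern syntax is not the same as in the `<strfmt>` lines above, and the function name `remove_between` and its `require_end` flag are made up for illustration. It only mimics the idea: remove from each begin string through the nearest end string, and optionally remove to end of file when the end string is missing.

```python
import re

def remove_between(text, begin, end, require_end=True):
    """Remove every span from `begin` through the nearest following `end`.

    Hypothetical helper, not part of Webinator. With require_end=False a
    `begin` that has no matching `end` is removed through end of text,
    which imitates the default behaviour described above.
    """
    b, e = re.escape(begin), re.escape(end)
    if require_end:
        # Lazy match: stop at the first `end` after `begin`.
        pattern = b + r".*?" + e
    else:
        # Lazy match that also accepts end of text when `end` is absent.
        pattern = b + r".*?(?:" + e + r"|\Z)"
    return re.sub(pattern, "", text, flags=re.DOTALL)

if __name__ == "__main__":
    page = ("keep1 <b>This is my start: junk my ending</a> keep2 "
            "<b>This is my start: more my ending</a> keep3")
    print(remove_between(page, "<b>This is my start:", "my ending</a>"))
```

With `require_end=True`, a page that has the start string but no end string comes back unchanged, which corresponds to the modified line without the trailing `?`.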