I was doing the same thing as shown in the previous postings on this topic. I took the raw document of a URL and tried to extract the text between "DisplayFirstColInfo('" and "')", but I am getting only the last occurrence of the text.
On displaying the raw document, I could see that all the "<" and ">" are replaced by "&lt;" and "&gt;". Is it because of that?
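One way to check whether the entity escaping matters is to decode the entities and compare what the same pattern finds before and after. The following is only a minimal Python sketch, not Vortex; the sample string and the helper name `values_between` are made up for illustration.

```python
import html
import re

def values_between(doc, start="DisplayFirstColInfo('", end="')"):
    """Return every piece of text found between start and end (non-greedy)."""
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    return re.findall(pattern, doc, flags=re.DOTALL)

# Hypothetical sample: markup escaped the way it appeared on display.
escaped = (
    "&lt;script&gt;DisplayFirstColInfo('Alpha');"
    "DisplayFirstColInfo('Beta')&lt;/script&gt;"
)
decoded = html.unescape(escaped)

# Compare the matches on the escaped and the decoded text.
print(values_between(escaped))
print(values_between(decoded))
```

In this sample the two markers contain no "<" or ">", so the escaping does not change what is matched. It would matter only if the quote characters in the markers were themselves escaped (for example as `&#39;`), in which case decoding first would be needed.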
This is what I did:
<fetch $u1>
<urlinfo rawdoc>
<$rawdoc=$ret>
<capture>
<rex ">>DisplayFirstColInfo('\P=!')+" $rawdoc>
<$firstcol=$ret>
<loop $firstcol>
$firstcol<fmt "\n">
</loop>
</capture>
<$text=$ret>
Here $text returns only one value.
The TEXIS version has changed since last April: I was using TEXIS 3 then, and now I am using TEXIS 4.0. I don't know why, but adding ROW to REX did the trick. I should have done it yesterday.
<fetch $u1>
<urlinfo rawdoc>
<$rawdoc=$ret>
<capture>
<rex ROW ">>DisplayFirstColInfo('\P=!')+" $rawdoc>
$ret
</rex>
</capture>
<$text=$ret>
This code gives me all the values.
Once $text is populated, I insert the data into a table so that it is searchable. This code is for LOTUS QUICKPLACE URLs, because they are not like ordinary HTML pages. All the text displayed on the web page is written through JavaScript, and the TEXIS crawler ignores JavaScript. On viewing the source of the document, I figured out that the information contained between "DisplayFirstColInfo('" and "')" is important and needs to be searchable. So this is how I am doing it. If you can tell me an alternative way to do it, that would be great.
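For readers outside Vortex, the same extract-then-store idea can be sketched in Python with the standard `re` and `sqlite3` modules. This is only an illustration under stated assumptions, not the Texis approach: the sample page source, the placeholder URL, and the table and column names (`quickplace_text`, `url`, `body`) are all hypothetical.

```python
import re
import sqlite3

# Hypothetical page source; real QuickPlace pages will differ.
raw_doc = """
<script>DisplayFirstColInfo('Project plan');</script>
<script>DisplayFirstColInfo('Meeting notes');</script>
"""
url = "http://example.com/quickplace/page"  # placeholder URL

# Collect every value between DisplayFirstColInfo(' and ').
values = re.findall(r"DisplayFirstColInfo\('(.*?)'\)", raw_doc, flags=re.DOTALL)

# In-memory database; table and column names are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quickplace_text (url TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO quickplace_text (url, body) VALUES (?, ?)",
    [(url, value) for value in values],
)
conn.commit()

# Simple substring search over the stored text.
hits = conn.execute(
    "SELECT body FROM quickplace_text WHERE body LIKE ?", ("%notes%",)
).fetchall()
print(hits)
```

Storing one row per extracted value keeps each piece of text individually searchable and tied to its source URL.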