Using REX

Faiz
Posts: 109
Joined: Wed Jan 10, 2001 1:29 pm

Post by Faiz »

Hi,
I want to grab everything that falls between First(' and '). For example, if the document contains First('abcd') and First('xyz'), I want to grab the text abcd and xyz. Is this possible with REX? The document can contain many occurrences, such as First('124'). All I need is the text between (' and ').
Thanks.
bart
Posts: 251
Joined: Wed Apr 26, 2000 12:42 am

Post by bart »

If I understand you correctly, you can search for the middle part and then check for the ends, like this:

begin=!begin+>>wanted=!end+end=

Post by Faiz »

I have to index some pages that are dynamically generated from a Lotus Notes database (called QuickPlace). When I view the source, I see that the text displayed on the page is actually written by JavaScript. The crawler cannot grab that text because JavaScript writes it to the browser, but the calls follow a certain pattern. So what I am trying to do is grab the text from the JavaScript calls. An example of what the source looks like:

<script>DisplayFirstColInfo('Six Sigma');ge ( '' );</script></A></td><td class=h-foldercompact-text valign=top>amy sombut</td> <td class=h-foldercompact-text valign=top>02/14/2001</td> </tr><script>gh('Several slides providing an overview of Six Sigma; process maps of how to incorporate Six Sigma tools into DMAIC; sample project.\n&nbsp;',3,1,0,0,0, 1)</script><script>gb(0, 3)</script><tr class=h-folderItem-bg valign=top><script>WriteResponseBar(0,0)</script><td class=h-folderitem-text valign=top><br></td><td class=h-folderitem-text valign=top><script>document.write (GenerateQPObjURLAnchorTag("h_D7BE94FAB94220FB852569F3004C607D", "F42B6E7B6B285ECE852569F3004CC722", "", ""));</script><script>DisplayFirstColInfo('The Methodology Outlined');ge ( '' );</script></A></td><td class=h-foldercompact-text valign=top>amy sombut</td> <td class=h-foldercompact-text valign=top>02/14/2001</td> </tr><script>gh('\n\nThis is still a work-in-process, however, it should serve as a guide when working with IT professionals and scoping out projects.\n\nhttp://webd01.corporate.ge.com/projectplan/dev_meth.html',3,1,0,0,0, 1)</script>

In the above example, I want the text between DisplayFirstColInfo(' and '). In the first line, the text I want is 'Six Sigma', and further along there is one more, 'The Methodology Outlined' (which is dynamically generated). Is it possible to get these using REX or something similar? How do I specify the start and end delimiters? Or will I have to write an awk script?
I hope I have made it clear this time.

Thanks,
John
Site Admin
Posts: 2623
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

For that sample you could follow the pattern above and do:

<rex ">>DisplayFirstColInfo('\P=!')+" $Html>
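
For readers who want to check the delimiter logic outside Texis, here is a rough Python analogue. It is not REX: it uses Python's re module, and it assumes the expression above is meant to skip the literal prefix DisplayFirstColInfo(' and return the text up to, but not including, the closing '). The name first_title and the shortened sample string are illustrative only.

```python
import re

# Literal prefix, then a lazy capture that stops at the first "')".
# Assumed to mirror the REX idea: leave the prefix out of the result
# and match up to, but not including, the closing delimiter.
PATTERN = re.compile(r"DisplayFirstColInfo\('(.*?)'\)", re.DOTALL)


def first_title(html):
    """Return the text of the first DisplayFirstColInfo('...') call, or None."""
    match = PATTERN.search(html)
    return match.group(1) if match else None


# Shortened from the page source quoted earlier in the thread.
sample = "<script>DisplayFirstColInfo('Six Sigma');ge ( '' );</script>"
print(first_title(sample))
```

Like a single look at the first result, search() stops at the first hit, so a page with several calls still yields only one value here.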
John Turnbull
Thunderstone Software

Post by Faiz »

Thanks a lot. But I am only getting the first occurrence of the delimited text. How can I get all the occurrences on the page?

Regards,

Post by bart »

You have to loop through the REX matches. Try this:

<capture>
<rex ">>DisplayFirstColInfo('\P=!')+" $Html>
<loop $ret>
$ret<fmt "\n">
</loop>
</capture>
<$text=$ret>
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

REX returns a list of all matches; you need to <loop> over them to see them all, or <sum> them into one string.
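
The "list of matches, then loop or join" idea can be sketched in Python as well. This is only an analogy, not Vortex: re.findall plays the role of the returned list, the for loop stands in for <loop>, and "\n".join stands in for combining the values into one string. The name all_titles and the trimmed sample are made up for illustration.

```python
import re

# Same assumed delimiters as in the thread: DisplayFirstColInfo(' ... ')
PATTERN = re.compile(r"DisplayFirstColInfo\('(.*?)'\)", re.DOTALL)


def all_titles(html):
    """Return every delimited value, in document order."""
    # With one capture group, findall returns a list of the captured strings.
    return PATTERN.findall(html)


# Trimmed from the page source quoted earlier in the thread.
sample = (
    "<script>DisplayFirstColInfo('Six Sigma');ge ( '' );</script>"
    "<td>amy sombut</td>"
    "<script>DisplayFirstColInfo('The Methodology Outlined');ge ( '' );</script>"
)

titles = all_titles(sample)

# Visit each match in turn.
for title in titles:
    print(title)

# Or collapse the whole list into a single newline-separated string.
text = "\n".join(titles)
```

If only one value shows up after a loop like this, a likely place to look is whether the variable holding the result is overwritten on each pass rather than accumulated.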

Post by Faiz »

Cool, thanks. I'm getting the desired results.

Post by Faiz »

Is something wrong with REX? Even after looping, I am only getting the last occurrence.

Post by mark »

REX is fine. Please elaborate on what you're doing and what you're expecting vs. what you're getting and how you're determining what you're getting.