Data from field

jgdoke
Posts: 167
Joined: Wed Jul 14, 2004 10:52 am

Post by jgdoke »

I would like to have a searchable and sortable field populated from text on the web page.
This page lists the language of the document:
http://literature.rockwellautomation.co ... bSize=4313
Here is the source code from two different publications.
The first publication is Japanese:
<tr id="xLanguage_row">
<td width="30%" align=right><span class=infoLabel>Language: </span></td>
<td width="70%">



<span class="tableEntry" title="Japanese">Japanese</span><!--'"-->



</td>
And this publication is English:

<tr id="xLanguage_row">
<td width="30%" align=right><span class=infoLabel>Language: </span></td>
<td width="70%">



<span class="tableEntry" title="English">English</span><!--'"-->



</td>

What REGEX would parse the language out?
Then how do I implement this in a crawl?
Thanks
John
mark
Site Admin
Posts: 5513
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

For Additional Fields:Name use "Language" (no quotes), and for Type use "Text". Save those settings, but don't start the walk yet. Then in Data from Field:Rex use
>>class\=infoLabel>Language: </span></td>=\space*<td width\="70%">=\space*<span class\="tableEntry" title\="=[^"]+">\P=[^<]+\F</span>
Leave Replace blank, set From Field to "HTML", leave From Meta Field blank, and set To Field to "Language".
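
To sanity-check the pattern outside the crawler, here is a rough Python sketch of the same idea. It uses Python's `re` module rather than Rex, so the syntax differs: a capture group stands in for what the `\P` and `\F` markers appear to do in the Rex expression (keep the surrounding markup as context but return only the language text). The `SAMPLE` string and the `extract_language` helper are made up for illustration and simply mirror the row markup quoted in the question.

```python
import re

# Hypothetical sample that mirrors the row markup quoted in the question.
SAMPLE = '''<tr id="xLanguage_row">
<td width="30%" align=right><span class=infoLabel>Language: </span></td>
<td width="70%">



<span class="tableEntry" title="Japanese">Japanese</span><!--'"-->



</td>'''

# Label cell, optional whitespace, value cell, optional whitespace,
# then the span whose text is the language name (the only captured part).
LANGUAGE_RE = re.compile(
    r'class=infoLabel>Language: </span></td>\s*'
    r'<td width="70%">\s*'
    r'<span class="tableEntry" title="[^"]+">'
    r'([^<]+)'
    r'</span>'
)


def extract_language(html):
    """Return the language text from the row, or None if the row is absent."""
    match = LANGUAGE_RE.search(html)
    return match.group(1) if match else None


print(extract_language(SAMPLE))  # prints Japanese
```

If this returns the expected value for a saved copy of a page, the Rex expression above should be matching the same markup, assuming the live pages use the same attribute order and quoting.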