XML Problem

gerry.odea
Posts: 98
Joined: Fri Sep 19, 2008 9:33 am

Post by gerry.odea »

I'm having a problem getting data out of an XML file. I have Commercial Version 2.6.929642470. Here's the problem:

The XML looks like this:
<GSP VER="3.2">
<TM>0.182013</TM>
<Q>convert excel into comma delimited file</Q>
<PARAM name="adsafe" value="high" original_value="high" />
<GROUPS>
<TITLE>
<U>http://groups.google.com/groups</U>
<TEXT>Google Groups results for</TEXT>
</TITLE>
<GROUPS_RESULT>
<U>/group/alt.comp.software.financial</U>
<TEXT>Excel files <b>into</b> Quicken</TEXT>
<GROUP_NAME>alt.comp.software.financial</GROUP_NAME>
<ARTICLE_DATE>Jan 18, 2002</ARTICLE_DATE>
</GROUPS_RESULT>
</GROUPS>
<ADS>
<AD n="1" type="text/wide" url="http://www.google.com/aclk?" visible_url="www.stylusstudio.com/CSV-File/" ctc_url="">
<LINE1>Csv File To Excel</LINE1>
<LINE2>Import &amp; Export</LINE2>
</AD>
</ADS>
<RES SN="1" EN="10">
<M>47800</M>
<FI />
<NB>
<NU>/search?q=convert+excel+into+comma+delimited+file</NU>
</NB>
<R N="1" L="1" MIME="text/html">
<U>http://www.ehow.com/how_4449036_import- ... el.html</U>
<UE>http://www.ehow.com/how_4449036_import- ... l.html</UE>
<T>How to Import a <b>Delimited File into Excel</b> | eHow.com</T>
<RK>0</RK>
<S>Importing the <b>delimited file into</b> Microsoft</S>
<LANG>en</LANG>
<HAS>
<L TAG="link:" />
<C SZ="60k" CID="AeLt86deLa4J" TAG="cache:" />
<RT TAG="related:" />
</HAS>
</R>
</RES>
</GSP>


Then I have this to parse it:

<$imports='
recdelim >><GSP
multiple
field SN varchar(10) />>\RSN\P\=\x22=[^\x22]+ ""
field EN varchar(10) />>\REN\P\=\x22=[^\x22]+ ""
field Counted varchar(20) />><M>\P=!</M>+ ""
field Pell varchar(20) />>\Rq\P\=\x22=[^\x22]+ ""
'>
<$imports1='
recdelim >><AD
multiple
field ADShowURL varchar(35) />>\Rvisible_url\P\=\x22=[^\x22]+ ""
field ADLink varchar(80) /!visible_=>>\Rurl\P\=\x22=[^\x22]+
field ADType varchar(10) />>\Rtype\P\=\x22=[^\x22]+ ""
field ADTitle varchar(40) />><LINE1>\P=!</LINE1>+ ""
field ADAbstract varchar(80) />><LINE2>\P=!</LINE2>+ ""
field ADAbstract2 varchar(40) />><LINE3>\P=!</LINE3>+ ""
'>
<$imports2='
recdelim >><R\x20N
multiple
field Link varchar(40) />><UE>\P=!</UE>+ ""
field ShowURL varchar(40) />><UE>\P=!</UE>+ ""
field Title varchar(80) />><T>\P=!</T>+ ""
field Title2 varchar(80) />><DT>\P=!</DT>+ ""
field Abstract varchar(80) />><S>\P=!</S>+ ""
field Abstract2 varchar(80) />><DS>\P=!</DS>+ ""
field Cache varchar(40) />>\RCID\P\=\x22=[^\x22]+ ""
'>


But the <U></U> tags from inside the <GROUPS></GROUPS> block are being pulled in as if they were under the <R N="1" L="1" MIME="text/html"> tag.

Please help me figure this out.
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

To get those parts, you need to look only at the portion between <R N="1"... and </R>. The recdelim simply separates the records, so everything before the first delimiter becomes the first record.
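
To make the splitting behaviour concrete, here is a minimal sketch in Python (not Vortex). It only mimics the idea described above: str.split stands in for recdelim, and the trimmed sample document and example.com URLs are made up for illustration. It is not how timport is implemented.

```python
import re

# Trimmed, made-up stand-in for the XML in the question.
doc = (
    '<GSP VER="3.2">'
    '<GROUPS><TITLE><U>http://groups.google.com/groups</U></TITLE></GROUPS>'
    '<RES SN="1" EN="10">'
    '<R N="1"><U>http://example.com/one</U></R>'
    '<R N="2"><U>http://example.com/two</U></R>'
    '</RES></GSP>'
)

# Cut the text at every occurrence of the delimiter, as a record
# delimiter would. Whatever precedes the first '<R N' is a record too.
records = doc.split('<R N')

# Collect the <U> values found in each record.
urls_per_record = [re.findall(r'<U>(.*?)</U>', rec) for rec in records]

for number, urls in enumerate(urls_per_record):
    print(number, urls)
```

Record 0 here is the preamble, which contains the <U> from <GROUPS>, so a field pattern that just looks for the tag picks it up as if it were a result.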

You could either upgrade to a newer version with more complete XML parsing, or rex for the block first and then timport just that portion; that should get you what you want.
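
The "block first, then fields" approach can be sketched as follows. This is Python rather than Vortex, purely to show the two-step shape: a regular expression plays the role of rex, and a small helper plays the role of the per-field patterns. The names R_BLOCK, field and parse_results, the sample document, and the example.com URLs are all hypothetical.

```python
import re

# Made-up sample with the same shape as the XML in the question.
doc = """<GSP VER="3.2">
<GROUPS>
<TITLE>
<U>http://groups.google.com/groups</U>
</TITLE>
</GROUPS>
<RES SN="1" EN="10">
<R N="1" L="1" MIME="text/html">
<U>http://example.com/one</U>
<T>First result</T>
<RK>0</RK>
</R>
<R N="2" L="1" MIME="text/html">
<U>http://example.com/two</U>
<T>Second result</T>
</R>
</RES>
</GSP>"""

# Step 1: isolate each <R ...>...</R> block. The \b keeps <RES>, <RK>
# and similar tags from matching as the start of a block.
R_BLOCK = re.compile(r'<R\b[^>]*>(.*?)</R>', re.DOTALL)

def field(block, tag):
    """Return the text of the first <tag>...</tag> in block, or ""."""
    m = re.search(r'<%s>(.*?)</%s>' % (tag, tag), block, re.DOTALL)
    return m.group(1) if m else ""

def parse_results(xml_text):
    """Step 2: pull fields out of each isolated block only."""
    rows = []
    for m in R_BLOCK.finditer(xml_text):
        block = m.group(1)
        rows.append({"Link": field(block, "U"), "Title": field(block, "T")})
    return rows

rows = parse_results(doc)
for row in rows:
    print(row)
```

Because the field patterns only ever see text between <R ...> and </R>, the <U> under <GROUPS> cannot leak into a result row.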
John Turnbull
Thunderstone Software