XML Problem.

Posted: Sun May 01, 2016 9:47 pm
by gerry.odea
I'm trying to go from:
<$imports='
recdelim >><GSP
field SN varchar(10) />>\RSN\P\=\x22=[^\x22]+ ""
field EN varchar(10) />>\REN\P\=\x22=[^\x22]+ ""
field Counted varchar(20) />><M>\P=!</M>+ ""
field Pell varchar(20) />>\Rq\P\=\x22=[^\x22]+ ""
field Pell2 varchar(20) />>\Rq\P\=\x27=[^\x27]+ ""
'>
<$imports2='
recdelim >><R\x20N
field ShowURL varchar(40) />><U>\P=!</U>+ ""
field Title varchar(80) />><T>\P=!</T>+ ""
field Abstract varchar(100) />><S>\P=!</S>+ ""
'>

to:

<$imports='
xml
field SN varchar(10) GSP/RES/@SN
field EN varchar(10) GSP/RES/@EN
field Counted varchar(20) GSP/RES/M
field Pell varchar(20) GSP/Spelling/Suggestion/@q
field Pell2 varchar(20) GSP/Spelling/Suggestion/@q
'>
<$imports2='
multiple
xml
field ShowURL varchar(40) GSP/RES/R/U
field Title varchar(80) GSP/RES/R/T
field Abstract varchar(100) GSP/RES/R/S
'>

but I keep getting:

<!-- 000 /open/search67:181: XML parsing error reported at byte offset 88 -->
<!-- 000 /open/search67:184: XML parsing error reported at byte offset 88 -->


This is the XML:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE GSP SYSTEM "google.dtd">
-<GSP VER="3.2">
<TM>0.354747</TM>
<Q>gerald odea</Q>
-<Spelling>
<Suggestion q="gerard o'dea"><b><i>gerard</i></b> <b><i>o'dea</i></b></Suggestion>
</Spelling>
-<RES SN="1" EN="10">
<M>11200</M>
<FI/>
-<NB>
<NU>/search?q=gerald+odea&lr=lang_en&safe=active&client=searchalot&hl=en&gl=us&output=xml&ie=UTF-8&oe=UTF-8&prmd=ivns&ei=SKkmV9LMBKW8jgS_95DIDg&start=10&sa=N</NU>
</NB>
-<R N="1">
<U>https://au.linkedin.com/pub/dir/Gerald/O'dea</U>
<UE>https://au.linkedin.com/pub/dir/Gerald/O%27dea</UE>
<T>Top 7 <b>Gerald O'dea</b> profiles | LinkedIn</T>
<RK>0</RK>
<S>View the profiles of professionals named <b>Gerald O'dea</b> on LinkedIn. There are 7 <br> professionals named <b>Gerald O'dea</b>, who use LinkedIn to exchange information,&nbsp;...</S>
<LANG>en</LANG>
-<HAS>
<L/>
<RT/>
</HAS>
</R>
-<R N="2" L="2">
<U>https://www.linkedin.com/in/jerry-o-dea-1a254814</U>
<UE>https://www.linkedin.com/in/jerry-o-dea-1a254814</UE>
<T><b>Jerry O'Dea</b> | LinkedIn</T>
<RK>0</RK>
<S>View <b>Jerry O'Dea's</b> professional profile on LinkedIn. LinkedIn is the world's <br> largest business network, helping professionals like <b>Jerry O'Dea</b> discover inside<br> &nbsp;...</S>
<LANG>en</LANG>
-<HAS>
<L/>
<RT/>
</HAS>
</R>
-<R N="3" L="2">
<U>https://uk.linkedin.com/in/gerardatdynamis</U>
<UE>https://uk.linkedin.com/in/gerardatdynamis</UE>
<T><b>Gerard O'Dea</b> | LinkedIn</T>
<RK>0</RK>
<S>View <b>Gerard O'Dea's</b> professional profile on LinkedIn. LinkedIn is the world's <br> largest business network, helping professionals like <b>Gerard O'Dea</b> discover <br> inside&nbsp;...</S>
<LANG>en</LANG>
-<HAS>
<L/>
<RT/>
</HAS>
</R>
-<R N="4">
<U>http://www.legacy.com/obituaries/southo ... 7946303</U>
<UE>http://www.legacy.com/obituaries/southo ... 946303</UE>
<T>Janet <b>ODea</b> Obituary - Brockton, MA | The Enterprise - Legacy.com</T>
<RK>0</RK>
<BYLINEDATE>1457164800</BYLINEDATE>
<S>Mar 5, 2016 <b>...</b> Janet was the wife of <b>Gerald</b> F. <b>ODea</b>. She was born in Brockton to the late <br> Amadell J. and Joan F. (Baynes) Sebelia. Janet was a 1960&nbsp;...</S>
<LANG>en</LANG>
-<HAS>
<L/>
<C SZ="135k" CID="aVeReY_pl_sJ"/>
<RT/>
</HAS>
</R>
</R>
</RES>
</GSP>

Any suggestions as to what I am doing wrong?

Gerry

Posted: Sun May 01, 2016 10:12 pm
by gerry.odea
<timport max=1 ROW $actimports $html></timport>
<timport max=1 ROW $actimports $html></timport>
<timport max=10 ROW $actimports2 $html><ORGANICRESULT></timport>


These timport statements seem to be causing the error, but I'm not sure how to fix them.

Posted: Sun May 01, 2016 11:24 pm
by gerry.odea
I resolved the previous error, but now when I use:
<timport max=10 ROW $actimports $html><ORGANICRESULT></timport><$keeploop=$loop>
<sandr $search1 $replace1 $Title><$Title=$ret>
<sandr $search1 $replace1 $Abstract>
<$Abstract=$ret>
<if $Title eq "">
<$Title=$ShowURL>
</if>
<sandr $search1 $replace1 $ShowURL>
<tr><td valign=top align=left><a href="$ret" class=result_t $target>
<abstract $Title 65 smart>
<fmt '%mbs' $q $ret></a>
<strfmt %s $Abstract><div class=result_a><fmt "%s" $ret></div>
<strfmt "%s" $ShowURL><$ShowURL=$ret><sandr "http://" "" $ShowURL>
<sandr "https://" "" $ret>
<abstract $ret 65 smart><lower $ret><div class=result_u><fmt '%s' $ret></div>
<if $loop lt 9><div style="height:25px"></div><else><div style="height:6px"></div></if></td></tr>

The results all come back lumped into a single result, as if the XML is not being split into separate records.

I'm using
<$imports='
xml nohtml
trimspace
field SN int GSP/RES@SN
field EN int GSP/RES@EN
field Counted int GSP/RES/M
field Pell varchar GSP/Spelling/Suggestion@q
multiple
field ShowURL varchar GSP/RES/R/U
field Title varchar GSP/RES/R/T
field Abstract varchar GSP/RES/R/S
'>

Posted: Sun May 01, 2016 11:25 pm
by mark
Not sure off the top of my head, but there seem to be stray hyphens in the XML.
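
One way to check that idea outside of Vortex is to run the raw response through a plain XML parser and see where it stops. The following is only a sketch in Python, not Texis code: the helper name first_xml_error and both sample strings are made up for illustration, and the byte offset it reports will not necessarily match the one timport prints.

```python
from xml.parsers import expat


def first_xml_error(data: bytes):
    """Return None if data is well-formed XML, else (byte_offset, message).

    Hypothetical helper for illustration; it only checks well-formedness.
    """
    parser = expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except expat.ExpatError as err:
        # ErrorByteIndex is the byte position where expat gave up.
        return parser.ErrorByteIndex, expat.ErrorString(err.code)
    return None


# A small well-formed response (made-up data).
CLEAN = b'<?xml version="1.0" encoding="UTF-8"?>\n<GSP VER="3.2"><TM>0.35</TM></GSP>'

# The same data with a "-" in front of the root element, as it would look
# when copied from a browser's collapsible XML view.
COPIED = b'<?xml version="1.0" encoding="UTF-8"?>\n-<GSP VER="3.2"><TM>0.35</TM></GSP>'

if __name__ == "__main__":
    print("clean :", first_xml_error(CLEAN))
    print("copied:", first_xml_error(COPIED))
```

If the data fetched by the script passes a check like this, the hyphens are only an artifact of copying from the browser and the problem is elsewhere.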

Posted: Mon May 02, 2016 6:46 am
by mark

Posted: Mon May 02, 2016 9:40 am
by John
You'll want two separate timports, one to get the overall information and one to get result rows. You will need to use xmldatasetlevel to tell timport at which depth a "record" is: https://www.thunderstone.com/site/texis ... _with.html

You may also find the XML API useful: https://www.thunderstone.com/site/vorte ... l_api.html
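
To illustrate the two-pass idea (one pass for the overall information, one for the result rows), here is a rough sketch in Python rather than Vortex. It does not show timport or xmldatasetlevel syntax; it only shows why the "record" depth matters: the summary fields occur once per document, while each R element under RES is one row. The sample XML, URLs, and function names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Cut-down, made-up response with the same shape as the one in the thread.
SAMPLE = """<GSP VER="3.2">
  <Spelling><Suggestion q="gerard o'dea">gerard o'dea</Suggestion></Spelling>
  <RES SN="1" EN="10">
    <M>11200</M>
    <R N="1"><U>https://example.com/a</U><T>First <b>hit</b></T><S>Snippet one</S></R>
    <R N="2"><U>https://example.com/b</U><T>Second hit</T><S>Snippet two</S></R>
  </RES>
</GSP>"""


def text_of(elem):
    """Flatten an element's text, including text inside child tags like <b>."""
    return "".join(elem.itertext()).strip() if elem is not None else ""


def overall_info(root):
    """First pass: fields that occur once per response."""
    res = root.find("RES")
    suggestion = root.find("Spelling/Suggestion")
    return {
        "SN": res.get("SN") if res is not None else None,
        "EN": res.get("EN") if res is not None else None,
        "Counted": text_of(root.find("RES/M")),
        "Pell": suggestion.get("q") if suggestion is not None else None,
    }


def result_rows(root):
    """Second pass: one record per GSP/RES/R element."""
    return [
        {
            "ShowURL": text_of(r.find("U")),
            "Title": text_of(r.find("T")),
            "Abstract": text_of(r.find("S")),
        }
        for r in root.findall("RES/R")
    ]


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(overall_info(root))
    for row in result_rows(root):
        print(row)
```

Treating the whole document as a single record is what makes every U, T, and S value end up in one row; iterating at the R level gives one row per result.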