relative path links

Posted: Thu Jan 31, 2008 5:59 pm
by aitchon
I'm fetching a page that has links with relative paths. When I run the following code, I get a link that looks like http://testurl.com/dir1/dir2/filename/. Instead, the path should be http://testurl.com/dir2/filename. (I don't want to use urllinks on the fetch of the $mainpath page, since I'm having problems extracting links that way.) How can I make the <fetch $mainpath $htmllink> statement give me the correct path?

<$mainpath="http://testurl.com/dir1/">
<fetch $mainpath>
<$htmlpage=$ret>
<!-- sample relative path href would be "dir2/filename/" -->
<rex row "<a\space=!</a>+</a>" $htmlpage>
<$htmllink=$ret>
<fetch $mainpath $htmllink>
<urllinks>
<loop max=1 $ret>
<$relativepathlink=$ret>
</loop>
</rex>

Posted: Fri Feb 01, 2008 10:32 am
by mark
Given a URL of "http://testurl.com/dir1/" and an href of "dir2/filename/", the correct full URL is "http://testurl.com/dir1/dir2/filename/", so I don't see the problem.
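
For anyone who wants to check this resolution outside of Vortex, here is a minimal sketch in Python (not Vortex) using the standard library's urljoin, which follows the usual relative-reference rules. The variable names are only illustrative, and the root-relative href in the second call is a hypothetical variant that is not taken from the page in question.

```python
from urllib.parse import urljoin

# Values from the question: the page URL and a relative href found on it.
page_url = "http://testurl.com/dir1/"
href = "dir2/filename/"

# A relative href with no leading slash is appended to the base's directory.
resolved = urljoin(page_url, href)
print(resolved)  # http://testurl.com/dir1/dir2/filename/

# Hypothetical variant: an href with a leading slash resolves against
# the site root, which would drop "dir1/" from the result.
root_relative = urljoin(page_url, "/dir2/filename/")
print(root_relative)  # http://testurl.com/dir2/filename/
```

So getting http://testurl.com/dir2/filename/ from that page URL would need either a root-relative href or a different base than the page's own URL.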

Also, what kind of problem are you having using urllinks directly from the first fetch? You're doing a ton of extra work to accomplish the same thing.

Posted: Fri Feb 01, 2008 10:42 am
by aitchon
If you look at the page http://www.autoexperience.de/20-Audi-TT-Forum/, the hrefs look like:

<a href="thread/117/Auch-Gaeste-koennen-Themen-und-Beitraege-erstellen/">Auch G&auml;ste k&ouml;nnen Themen und Beitr&auml;ge erstellen!!</a>

When I do <fetch $mainpath $htmllink> using these two values, I get:

http://www.autoexperience.de/20-Audi-TT ... erstellen/

But the value should be:

http://www.autoexperience.de/thread/117 ... erstellen/

This is the value that urllinks produces. The reason I'm not using urllinks is that I've found it unreliable on pages that might have broken HTML.

Posted: Fri Feb 01, 2008 10:46 am
by John
The page itself has a base in it:

<base href="http://www.autoexperience.de/" />

so relative links are resolved against that, not the page's URL. <urllinks> sees the base and gets the right answer.
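
To illustrate the effect of the base element, here is a rough sketch in Python (not Vortex, and not how <urllinks> is implemented). It pulls the first <base href> out of a page, if there is one, and resolves every <a href> against it instead of against the page URL. The class name BaseAndLinks, the function resolve_links, and the sample_html string are made up for this example; sample_html is a stripped-down stand-in for the real page, containing only the base tag and the one link quoted earlier in the thread.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class BaseAndLinks(HTMLParser):
    """Collects the first <base href> and every <a href> in a document."""

    def __init__(self):
        super().__init__()
        self.base_href = None
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "base" and self.base_href is None and attrs.get("href"):
            self.base_href = attrs["href"]
        elif tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])


def resolve_links(page_url, html):
    """Resolve links against <base href> if present, else the page URL."""
    parser = BaseAndLinks()
    parser.feed(html)
    # The base href may itself be relative, so resolve it against the page URL.
    base = urljoin(page_url, parser.base_href) if parser.base_href else page_url
    return [urljoin(base, href) for href in parser.hrefs]


page_url = "http://www.autoexperience.de/20-Audi-TT-Forum/"
href = "thread/117/Auch-Gaeste-koennen-Themen-und-Beitraege-erstellen/"

# Simplified stand-in for the fetched page (assumption, not the real markup).
sample_html = (
    '<html><head><base href="http://www.autoexperience.de/" /></head>'
    '<body><a href="' + href + '">link</a></body></html>'
)

with_base = resolve_links(page_url, sample_html)
without_base = urljoin(page_url, href)

print(with_base[0])   # resolved against the <base>, the site root
print(without_base)   # resolved against the page URL, under 20-Audi-TT-Forum/
```

The second result is what you get by combining the page URL and the href directly, which matches the wrong value reported above. The first is what a browser would use.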